[PATCH 4.10 070/111] audit: fix auditd/kernel connection state tracking

Since I cannot find the original email on lkml, NAK on this.
__mutex_owner() is not a general purpose helper function.

Since this code also exists in the current kernel, I need to ask what
recommended alternatives exist for determining the mutex owner?
I imagine we could track the mutex owner separately in the audit
subsystem, but I'd much prefer to leverage an existing mechanism if
possible.

It's not at all clear to me what that code does, I just stumbled upon
__mutex_owner() outside of the mutex code itself and went WTF.

The comment (aside from having the most horribly style) is wrong too,
because it claims it will not block when we hold that lock, while,
afaict, it will in fact do just that.

Maybe if you could explain how that code is supposed to work and why it
doesn't know if it holds a lock I could make a suggestion...

Paul Moore

2018-02-20 14:51:08 UTC

Since I cannot find the original email on lkml, NAK on this.
__mutex_owner() is not a general purpose helper function.

Since this code also exists in the current kernel, I need to ask what
recommended alternatives exist for determining the mutex owner?
I imagine we could track the mutex owner separately in the audit
subsystem, but I'd much prefer to leverage an existing mechanism if
possible.

It's not at all clear to me what that code does, I just stumbled upon
__mutex_owner() outside of the mutex code itself and went WTF.

If you don't want people to use __mutex_owner() outside of the mutex
code I might suggest adding a rather serious comment at the top of the
function, because right now I don't see anything suggesting that
function shouldn't be used. Yes, there is the double underscore
prefix, but that can mean a few different things these days.

Post by Peter Zijlstra
The comment (aside from having the most horribly style) ...

Yeah, your dog is ugly too. Notice how neither comment is constructive?

Post by Peter Zijlstra
... is wrong too, because it claims it will not block when we hold that lock, while,
afaict, it will in fact do just that.

A mutex blocks when it is held, but the audit_log_start() function
should not block for the task that currently holds the
audit_cmd_mutex; that is what the comment is meant to convey. I
believe the comment makes sense, but I did write it so I'll concede
that I'm probably the not best judge. If anyone would like to offer a
different wording I'm happy to consider it.

Post by Peter Zijlstra
Maybe if you could explain how that code is supposed to work and why it
doesn't know if it holds a lock I could make a suggestion...

I just spent a few minutes looking back over the bits available in
include/linux/mutex.h and I'm not seeing anything beyond
__mutex_owner() which would allow us to determine the mutex owning
task. It's probably easiest for us to just track ownership ourselves.
I'll put together a patch later today.

--
paul moore
www.paul-moore.com

Peter Zijlstra

2018-02-20 15:18:49 UTC

Post by Peter Zijlstra
It's not at all clear to me what that code does, I just stumbled upon
__mutex_owner() outside of the mutex code itself and went WTF.

Find below.

Post by Peter Zijlstra
The comment (aside from having the most horribly style) ...

Yeah, your dog is ugly too. Notice how neither comment is constructive?

I'm sure you've seen this one:

https://lkml.org/lkml/2016/7/8/625

It's all about reading code; inconsistent and unbalanced styles are just
_really_ hard on the brain.

Post by Peter Zijlstra
... is wrong too, because it claims it will not block when we hold that lock, while,
afaict, it will in fact do just that.

The comment uses 'sleep' which is typically used to mean anything that
schedules, but then it does the schedule_timeout() thing.

Post by Peter Zijlstra
Maybe if you could explain how that code is supposed to work and why it
doesn't know if it holds a lock I could make a suggestion...

Note that up until recently the mutex implementation didn't even have a
consistent owner field. And the thing is, it's very easy to use wrong,
only today I've seen a patch do: "__mutex_owner() == task", where task
was allowed to be !current, which is just wrong.

Looking through kernel/audit.c I'm not even sure I see how you would end
up in audit_log_start() with audit_cmd_mutex held.

Can you give me a few code paths that trigger this? Simple git-grep is
failing me.

---
Subject: mutex: Add comment to __mutex_owner()
From: Peter Zijlstra <***@infradead.org>
Date: Tue Feb 20 16:01:36 CET 2018

Attempt to deter usage, this is not a public interface. It is entirely
possibly to implement a conformant mutex without having this owner
field (in fact, we used to have that).

Signed-off-by: Peter Zijlstra (Intel) <***@infradead.org>
---
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -66,6 +66,11 @@ struct mutex {
#endif
};

+/*
+ * Internal helper function; C doesn't allow us to hide it :/
+ *
+ * DO NOT USE (outside of mutex code).
+ */
static inline struct task_struct *__mutex_owner(struct mutex *lock)
{
return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);

Paul Moore

2018-02-20 17:06:26 UTC

Post by Peter Zijlstra
It's not at all clear to me what that code does, I just stumbled upon
__mutex_owner() outside of the mutex code itself and went WTF.

Find below.

Post by Peter Zijlstra
The comment (aside from having the most horribly style) ...

Yeah, your dog is ugly too. Notice how neither comment is constructive?

https://lkml.org/lkml/2016/7/8/625

Yep. I stand behind my earlier comment in this thread.

Post by Peter Zijlstra
Maybe if you could explain how that code is supposed to work and why it
doesn't know if it holds a lock I could make a suggestion...

Arguably all the more reason why a strongly worded warning is
important (which I see you've included below, feel free to include my
Reviewed-by).

Post by Peter Zijlstra
Looking through kernel/audit.c I'm not even sure I see how you would end
up in audit_log_start() with audit_cmd_mutex held.
Can you give me a few code paths that trigger this? Simple git-grep is
failing me.

Basically look at the code in audit_receive_msg(), but I wasn't asking
your opinion on how we should rewrite the audit subsystem, I was just
asking how one could determine if the current task was holding a given
mutex in a way that was acceptable to you. Based on your comments,
and some further inspection of the mutex code, it appears that is/was
not something that the core mutex code wants to support/make-visible.
Which is perfectly fine, I just wanted to make sure I wasn't missing
something before I went ahead and wrote a wrapper around the mutex
code for use by audit.

FWIW, I just put together the following patch which removes the
__mutex_owner() call from audit and doesn't appear to break anything
on the audit side (you're CC'd on the patch). It has only been
lightly tested, but I'm going to bang on it for a day or so and if I
hear no objections I'll merge it into audit/next.

* https://www.redhat.com/archives/linux-audit/2018-February/msg00066.html

Post by Peter Zijlstra
---
Subject: mutex: Add comment to __mutex_owner()
Date: Tue Feb 20 16:01:36 CET 2018
Attempt to deter usage, this is not a public interface. It is entirely
possibly to implement a conformant mutex without having this owner
field (in fact, we used to have that).
---
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -66,6 +66,11 @@ struct mutex {
#endif
};
+/*
+ * Internal helper function; C doesn't allow us to hide it :/
+ *
+ * DO NOT USE (outside of mutex code).
+ */
static inline struct task_struct *__mutex_owner(struct mutex *lock)
{
return (struct task_struct *)(atomic_long_read(&lock->owner) & ~0x07);

Reviewed-by: Paul Moore <***@paul-moore.com>

--
paul moore
www.paul-moore.com

Ingo Molnar

2018-02-21 08:46:02 UTC

Post by Peter Zijlstra
It's not at all clear to me what that code does, I just stumbled upon
__mutex_owner() outside of the mutex code itself and went WTF.

Find below.

Post by Peter Zijlstra
The comment (aside from having the most horribly style) ...

Yeah, your dog is ugly too. Notice how neither comment is constructive?

https://lkml.org/lkml/2016/7/8/625

Yep. I stand behind my earlier comment in this thread.

Post by Peter Zijlstra
Maybe if you could explain how that code is supposed to work and why it
doesn't know if it holds a lock I could make a suggestion...

Arguably all the more reason why a strongly worded warning is
important (which I see you've included below, feel free to include my
Reviewed-by).

Basically look at the code in audit_receive_msg(), but I wasn't asking
your opinion on how we should rewrite the audit subsystem, I was just
asking how one could determine if the current task was holding a given
mutex in a way that was acceptable to you. Based on your comments,
and some further inspection of the mutex code, it appears that is/was
not something that the core mutex code wants to support/make-visible.
Which is perfectly fine, I just wanted to make sure I wasn't missing
something before I went ahead and wrote a wrapper around the mutex
code for use by audit.
FWIW, I just put together the following patch which removes the
__mutex_owner() call from audit and doesn't appear to break anything
on the audit side (you're CC'd on the patch). It has only been
lightly tested, but I'm going to bang on it for a day or so and if I
hear no objections I'll merge it into audit/next.
* https://www.redhat.com/archives/linux-audit/2018-February/msg00066.html

Could you please explain the audit_ctl_lock()/unlock() primitive you are
introducing there? You seem to be implementing some sort of recursive locking
primitive, but in a strange way.

AFAICS the primary problem appears to be this code path:

audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET -> audit_log_common_recv_msg() -> audit_log_start()

where we can arrive already holding the lock.

I.e. recursive mutex, kinda.

What's the thinking there? Neither the changelog nor the code explains this.

Thanks,

Ingo

Peter Zijlstra

2018-02-21 09:15:04 UTC

Post by Ingo Molnar
audit_receive() -> audit_receive_msg() -> AUDIT_TTY_SET -> audit_log_common_recv_msg() -> audit_log_start()
where we can arrive already holding the lock.
I.e. recursive mutex, kinda.

I _think_ something like the below ought to work, but I've no idea how
to even begin testing audit.

---
kernel/audit.c | 31 ++++++++++++++++++++++++-------
1 file changed, 24 insertions(+), 7 deletions(-)

diff --git a/kernel/audit.c b/kernel/audit.c
index 227db99b0f19..24175754f79d 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -184,6 +184,9 @@ static char *audit_feature_names[2] = {
/* Serialize requests from userspace. */
DEFINE_MUTEX(audit_cmd_mutex);

+static struct audit_buffer *__audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
+ int type, bool recursive);
+
/* AUDIT_BUFSIZ is the size of the temporary buffer used for formatting
* audit records. Since printk uses a 1024 byte buffer, this buffer
* should be at least that large. */
@@ -357,7 +360,7 @@ static int audit_log_config_change(char *function_name, u32 new, u32 old,
struct audit_buffer *ab;
int rc = 0;

- ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE);
+ ab = __audit_log_start(NULL, GFP_KERNEL, AUDIT_CONFIG_CHANGE, true);
if (unlikely(!ab))
return rc;
audit_log_format(ab, "%s=%u old=%u", function_name, new, old);
@@ -1024,7 +1027,7 @@ static void audit_log_common_recv_msg(struct audit_buffer **ab, u16 msg_type)
return;
}

- *ab = audit_log_start(NULL, GFP_KERNEL, msg_type);
+ *ab = __audit_log_start(NULL, GFP_KERNEL, msg_type, true);
if (unlikely(!*ab))
return;
audit_log_format(*ab, "pid=%d uid=%u", pid, uid);
@@ -1057,7 +1060,7 @@ static void audit_log_feature_change(int which, u32 old_feature, u32 new_feature
if (audit_enabled == AUDIT_OFF)
return;

- ab = audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE);
+ ab = __audit_log_start(NULL, GFP_KERNEL, AUDIT_FEATURE_CHANGE, true);
audit_log_task_info(ab, current);
audit_log_format(ab, " feature=%s old=%u new=%u old_lock=%u new_lock=%u res=%d",
audit_feature_names[which], !!old_feature, !!new_feature,
@@ -1578,6 +1581,12 @@ static int __init audit_enable(char *str)

if (audit_default == AUDIT_OFF)
audit_initialized = AUDIT_DISABLED;
+ /*
+ * Normally audit_set_enabled() would need to be called under
+ * @audit_cmd_mutex, however since audit_do_config_change() will not in
+ * fact call audit_log_config_change() when 'audit_enabled ==
+ * AUDIT_OFF', we can use it here without issue.
+ */
if (audit_set_enabled(audit_default))
panic("audit: error setting audit state (%d)\n", audit_default);

@@ -1690,8 +1699,8 @@ static inline void audit_get_stamp(struct audit_context *ctx,
* will be written at syscall exit. If there is no associated task, then
* task context (ctx) should be NULL.
*/
-struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
- int type)
+static struct audit_buffer *__audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
+ int type, bool recursive)
{
struct audit_buffer *ab;
struct timespec64 t;
@@ -1703,6 +1712,9 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
if (unlikely(!audit_filter(type, AUDIT_FILTER_TYPE)))
return NULL;

+ if (recursive)
+ lockdep_assert_held(&audit_cmd_mutex);
+
/* NOTE: don't ever fail/sleep on these two conditions:
* 1. auditd generated record - since we need auditd to drain the
* queue; also, when we are checking for auditd, compare PIDs using
@@ -1710,8 +1722,7 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
* using a PID anchored in the caller's namespace
* 2. generator holding the audit_cmd_mutex - we don't want to block
* while holding the mutex */
- if (!(auditd_test_task(current) ||
- (current == __mutex_owner(&audit_cmd_mutex)))) {
+ if (!(auditd_test_task(current) || recursive)) {
long stime = audit_backlog_wait_time;

while (audit_backlog_limit &&
@@ -1753,6 +1764,12 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
return ab;
}

+struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
+ int type)
+{
+ return __audit_log_start(ctx, gfp_mask, type, false);
+}
+
/**
* audit_expand - expand skb in the audit buffer
* @ab: audit_buffer

Paul Moore

2018-02-21 23:04:30 UTC

Post by Ingo Molnar

Post by Peter Zijlstra
It's not at all clear to me what that code does, I just stumbled upon
__mutex_owner() outside of the mutex code itself and went WTF.

Find below.

Post by Peter Zijlstra
The comment (aside from having the most horribly style) ...

Yeah, your dog is ugly too. Notice how neither comment is constructive?

https://lkml.org/lkml/2016/7/8/625

Yep. I stand behind my earlier comment in this thread.