Discussion:
[PATCH 00/12] RFC: steps to make audit pid namespace-safe
(too old to reply)
Richard Guy Briggs
2013-08-20 21:40:24 UTC
Permalink
This patchset is a revival of some of Eric Biederman's work to make audit
pid-namespace-safe.

In a couple of places, audit was printing PIDs in the task's pid namespace
rather than relative to the audit daemon's pid namespace, which currently is
init_pid_ns.

It also allows processes to log audit user messages in their own pid
namespaces, which was not previously permitted. Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=947530
https://bugs.launchpad.net/ubuntu/+source/vsftpd/+bug/1160372
https://bugzilla.novell.com/show_bug.cgi?id=786024

Part of the cleanup here involves deprecating task->pid and task->tgid, which
are error-prone duplicates of the task->pids structure

The next step which I hope to add to this patchset will be to purge task->pid
and task->tgid from the rest of the kernel if possible. Once that is done,
task_pid_nr_init_ns() and task_tgid_nr_init_ns() that were introduced in patch
05/12 and used in patches 06/12 and 08/12 could be replaced with task_pid_nr()
and task_tgid_nr(). Eric B. did take a stab at that, but checking all the
subtleties will be non-trivial.

Does anyone have any opinions or better yet hard data on cache line misses
between pid_nr(struct pid*) and pid_nr_ns(struct pid*, &init_pid_ns)? I'd
like to see pid_nr() use pid_nr_ns(struct pid*, &init_pid_ns), or
pid_nr_init_ns() eliminated in favour of the original pid_nr(). pid_nr()
currently accesses the first level of the pid structure without having to
dereference the level number. If there is an actual speed difference, it could
be worth keeping, otherwise, I'd prefer to simplify that code.

Eric also had a patch to add a printk option to format a struct pid pointer
which was PID namespace-aware. I don't see the point, but I'll let him explain
it.

Discuss.

Eric W. Biederman (5):
audit: Kill the unused struct audit_aux_data_capset
audit: Simplify and correct audit_log_capset

Richard Guy Briggs (7):
audit: fix netlink portid naming and types
pid: get ppid pid_t of task in init_pid_ns safely
audit: convert PPIDs to the inital PID namespace.
pid: get pid_t of task in init_pid_ns correctly
audit: store audit_pid as a struct pid pointer
audit: anchor all pid references in the initial pid namespace
pid: modify task_pid_nr to work without task->pid.
pid: modify task_tgid_nr to work without task->tgid.
pid: rewrite task helper functions avoiding task->pid and task->tgid
pid: mark struct task const in helper functions

drivers/tty/tty_audit.c | 3 +-
include/linux/audit.h | 8 ++--
include/linux/pid.h | 6 +++
include/linux/sched.h | 81 ++++++++++++++++++++++++----------
kernel/audit.c | 76 +++++++++++++++++++------------
kernel/audit.h | 12 +++---
kernel/auditfilter.c | 35 +++++++++++----
kernel/auditsc.c | 36 ++++++---------
kernel/capability.c | 2 +-
kernel/pid.c | 4 +-
security/apparmor/audit.c | 7 +--
security/integrity/integrity_audit.c | 2 +-
security/lsm_audit.c | 11 +++--
security/tomoyo/audit.c | 2 +-
14 files changed, 177 insertions(+), 108 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-08-22 19:15:28 UTC
Permalink
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
Probably it would be better to simply kill it. Almost every usage is
wrong.
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
hmm. there should be a simpler check for this...
static inline int has_group_leader_pid(struct task_struct *p)
{
- return p->pid == p->tgid;
+ return task_pid(p) == task_tgid(p);
}
static inline
int same_thread_group(struct task_struct *p1, struct task_struct *p2)
{
- return p1->tgid == p2->tgid;
+ return task_tgid(p1) == task_tgid(p2);
This is suboptinal. See the attached
include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid.patch
from -mm below.

Oleg.


--- a/include/linux/sched.h~include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid
+++ a/include/linux/sched.h
@@ -2179,15 +2179,15 @@ static inline bool thread_group_leader(s
* all we care about is that we have a task with the appropriate
* pid, we don't actually care if we have the right task.
*/
-static inline int has_group_leader_pid(struct task_struct *p)
+static inline bool has_group_leader_pid(struct task_struct *p)
{
- return p->pid == p->tgid;
+ return task_pid(p) == p->signal->leader_pid;
}

static inline
-int same_thread_group(struct task_struct *p1, struct task_struct *p2)
+bool same_thread_group(struct task_struct *p1, struct task_struct *p2)
{
- return p1->tgid == p2->tgid;
+ return p1->signal == p2->signal;
}

static inline struct task_struct *next_thread(const struct task_struct *p)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-08-26 22:08:16 UTC
Permalink
Post by Oleg Nesterov
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
Probably it would be better to simply kill it. Almost every usage is
wrong.
Can you be more clear? I don't follow. It should instead return a
boolean. Usage of is_global_init() or task_pid_nr()?

If is_global_init(), is that because they could be unaware of pid
namespaces?

If task_pid_nr(), is that for the same reason?
Post by Oleg Nesterov
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
hmm. there should be a simpler check for this...
Other than the original, this one is pretty simple. What did you have
in mind?
Post by Oleg Nesterov
static inline int has_group_leader_pid(struct task_struct *p)
{
- return p->pid == p->tgid;
+ return task_pid(p) == task_tgid(p);
}
static inline
int same_thread_group(struct task_struct *p1, struct task_struct *p2)
{
- return p1->tgid == p2->tgid;
+ return task_tgid(p1) == task_tgid(p2);
This is suboptinal. See the attached
include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid.patch
from -mm below.
I'm fine with that if it is deemed better. The point was to remove the
dependence on task_struct::tgid.
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-08-27 16:22:11 UTC
Permalink
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
Probably it would be better to simply kill it. Almost every usage is
wrong.
Can you be more clear? I don't follow. It should instead return a
boolean. Usage of is_global_init() or task_pid_nr()?
Just look at the callers. For example, how is_global_init() can save
/sbin/init from oom-killer if it is multithreaded ?
Post by Richard Guy Briggs
If is_global_init(), is that because they could be unaware of pid
namespaces?
Because I think nobody actually needs is_a_group_leader_of_global_init(),
and this is what this helper actually is.
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
hmm. there should be a simpler check for this...
Other than the original, this one is pretty simple.
I meant that the original check is cheaper,
Post by Richard Guy Briggs
What did you have
in mind?
Well. I agree that it would be nice to avoid the dependence on task->pid
if possible. And perhaps even kill it eventually. But I am not sure how
much we should try.

If it was the last user of ->pid, then I would agree with this change.
Although we can make it cheaper, say, we can change idle_init() to
nullify tasks->next and use ->next == NULL.

But we have a lot more ->pid users, perhaps we should change them first.

And more importantly, let me repeat. I do not think that this change
should be mixed with other changes in this series.
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline int has_group_leader_pid(struct task_struct *p)
{
- return p->pid == p->tgid;
+ return task_pid(p) == task_tgid(p);
}
static inline
int same_thread_group(struct task_struct *p1, struct task_struct *p2)
{
- return p1->tgid == p2->tgid;
+ return task_tgid(p1) == task_tgid(p2);
This is suboptinal. See the attached
include-linux-schedh-dont-use-task-pid-tgid-in-same_thread_group-has_group_leader_pid.patch
from -mm below.
I'm fine with that if it is deemed better. The point was to remove the
dependence on task_struct::tgid.
But I agree! My only point was, this conflicts with the patch we already
have and that patch is more optimal. p1->leader == p2->leader is cheaper.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-12-16 17:36:25 UTC
Permalink
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
Probably it would be better to simply kill it. Almost every usage is
wrong.
Can you be more clear? I don't follow. It should instead return a
boolean. Usage of is_global_init() or task_pid_nr()?
If is_global_init(), is that because they could be unaware of pid
namespaces?
If task_pid_nr(), is that for the same reason?
Oleg, I still don't understand your comment above. Kill what,
"is_global_init()"? If so, how is almost every usage of it wrong?

There are a number of functions that call is_global_init(). Might any
of them be called from inside the namespace context of a container and
hence should return true?
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
hmm. there should be a simpler check for this...
Other than the original, this one is pretty simple. What did you have
in mind?
I vaguely remember a clarification to this, but don't remember and can't
find it. What sort of simplification did you have in mind? I'd like to
go at least to:
task_pid_nr(p) == 0
Post by Richard Guy Briggs
Post by Oleg Nesterov
Oleg.
- RGB
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-12-16 21:05:38 UTC
Permalink
Hi Richard,

Sorry, I already forgot the context, not sure I understand your email
correctly.
Post by Richard Guy Briggs
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
Probably it would be better to simply kill it. Almost every usage is
wrong.
Can you be more clear? I don't follow. It should instead return a
boolean. Usage of is_global_init() or task_pid_nr()?
If is_global_init(), is that because they could be unaware of pid
namespaces?
If task_pid_nr(), is that for the same reason?
Oleg, I still don't understand your comment above. Kill what,
"is_global_init()"? If so, how is almost every usage of it wrong?
Because is_global_init() is only true for the main thread of /sbin/init.

Just look at oom_unkillable_task(). It tries to not kill init. But, say,
select_bad_process() can happily find a sub-thread of is_global_init()
and still kill it.
Post by Richard Guy Briggs
There are a number of functions that call is_global_init(). Might any
of them be called from inside the namespace context of a container and
hence should return true?
Not sure I understand, but certainly some callers should check ->child_reaper
or SIGNAL_UNKILLABLE instead. Say, unhandled_signal().
Post by Richard Guy Briggs
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
hmm. there should be a simpler check for this...
Other than the original, this one is pretty simple. What did you have
in mind?
I vaguely remember a clarification to this, but don't remember and can't
find it. What sort of simplification did you have in mind?
I do not remember ;) Most probably, I meant "it would be nice to find a
simpler check".

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-12-16 22:21:33 UTC
Permalink
Post by Oleg Nesterov
Hi Richard,
Sorry, I already forgot the context, not sure I understand your email
correctly.
Post by Richard Guy Briggs
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
Probably it would be better to simply kill it. Almost every usage is
wrong.
Can you be more clear? I don't follow. It should instead return a
boolean. Usage of is_global_init() or task_pid_nr()?
If is_global_init(), is that because they could be unaware of pid
namespaces?
If task_pid_nr(), is that for the same reason?
Oleg, I still don't understand your comment above. Kill what,
"is_global_init()"? If so, how is almost every usage of it wrong?
Because is_global_init() is only true for the main thread of /sbin/init.
Just look at oom_unkillable_task(). It tries to not kill init. But, say,
select_bad_process() can happily find a sub-thread of is_global_init()
and still kill it.
Ah! So it should be task_tgid_nr(tsk) == 1.
Post by Oleg Nesterov
Post by Richard Guy Briggs
There are a number of functions that call is_global_init(). Might any
of them be called from inside the namespace context of a container and
hence should return true?
Not sure I understand, but certainly some callers should check ->child_reaper
or SIGNAL_UNKILLABLE instead. Say, unhandled_signal().
So in some situations it should allow it to kill init of a container and
in others, refuse it.
Post by Oleg Nesterov
Post by Richard Guy Briggs
Post by Richard Guy Briggs
Post by Oleg Nesterov
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
hmm. there should be a simpler check for this...
Other than the original, this one is pretty simple. What did you have
in mind?
I vaguely remember a clarification to this, but don't remember and can't
find it. What sort of simplification did you have in mind?
I do not remember ;) Most probably, I meant "it would be nice to find a
simpler check".
I'll stick with task_pid_nr(p) == 0.
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2013-12-17 09:35:00 UTC
Permalink
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
I'll stick with task_pid_nr(p) == 0.
We're going to probably switch to:

return p->flags & PF_IDLE;

Soon, because people are playing silly tricks and want normal threads
to temporarily appear to be the idle thread (idle time injection).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2013-12-17 09:48:58 UTC
Permalink
Post by Peter Zijlstra
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
I'll stick with task_pid_nr(p) == 0.
return p->flags & PF_IDLE;
Soon, because people are playing silly tricks and want normal threads
to temporarily appear to be the idle thread (idle time injection).
See http://marc.info/?l=linux-kernel&m=138548236506953&w=2 for more
context.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-12-20 04:54:48 UTC
Permalink
Post by Peter Zijlstra
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
I'll stick with task_pid_nr(p) == 0.
return p->flags & PF_IDLE;
Soon, because people are playing silly tricks and want normal threads
to temporarily appear to be the idle thread (idle time injection).
Ok, this addresses my concerns. I'll stay out of the way.

- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2013-08-22 20:06:21 UTC
Permalink
This stops these four task helper functions from using the deprecated and
error-prone task->pid and task->tgid.
(informed by ebiederman's ea5a4d01)
---
include/linux/sched.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8e69807..46e739d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
*/
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
extern struct pid *cad_pid;
@@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
*/
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
extern struct task_struct *curr_task(int cpu);
extern void set_curr_task(int cpu, struct task_struct *p);
Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-08-22 21:44:26 UTC
Permalink
Post by Peter Zijlstra
This stops these four task helper functions from using the deprecated and
error-prone task->pid and task->tgid.
(informed by ebiederman's ea5a4d01)
---
include/linux/sched.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8e69807..46e739d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
*/
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
extern struct pid *cad_pid;
@@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
*/
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
extern struct task_struct *curr_task(int cpu);
extern void set_curr_task(int cpu, struct task_struct *p);
Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.
Backups are generally considered a good idea, but in this case, I'd
quote:
"A man with one watch knows what time it is. A man with two is
never certain."

Reminds me of the twist of a phrase frequently seen in the US gov:
"Government Duplicity, Do Not Propagate" ;-)


Can you suggest a safe way to live with this duplicity?


- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2013-08-23 06:36:36 UTC
Permalink
Post by Richard Guy Briggs
Post by Peter Zijlstra
This stops these four task helper functions from using the deprecated and
error-prone task->pid and task->tgid.
(informed by ebiederman's ea5a4d01)
---
include/linux/sched.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8e69807..46e739d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
*/
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
extern struct pid *cad_pid;
@@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
*/
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
extern struct task_struct *curr_task(int cpu);
extern void set_curr_task(int cpu, struct task_struct *p);
Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.
Backups are generally considered a good idea, but in this case, I'd
"A man with one watch knows what time it is. A man with two is
never certain."
Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.

Furthermore idle threads really are special and it doesn't make sense to
address them in any but the root namespace, doubly so because only
kernel space does this.

As for the init thread, that function is called is_global_init() for
crying out loud, what numb nut doesn't get that that's supposed to be
using the root namespace?

Seriously, you namespace guys should stop messing things up and
confusing yourselves and others.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-08-27 02:38:00 UTC
Permalink
Post by Peter Zijlstra
Post by Richard Guy Briggs
Post by Peter Zijlstra
This stops these four task helper functions from using the deprecated and
error-prone task->pid and task->tgid.
(informed by ebiederman's ea5a4d01)
---
include/linux/sched.h | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 8e69807..46e739d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1579,7 +1579,7 @@ static inline int pid_alive(struct task_struct *p)
*/
static inline int is_global_init(struct task_struct *tsk)
{
- return tsk->pid == 1;
+ return task_pid_nr(tsk) == 1;
}
extern struct pid *cad_pid;
@@ -1930,7 +1930,7 @@ extern struct task_struct *idle_task(int cpu);
*/
static inline bool is_idle_task(const struct task_struct *p)
{
- return p->pid == 0;
+ return task_pid(p) == &init_struct_pid;
}
extern struct task_struct *curr_task(int cpu);
extern void set_curr_task(int cpu, struct task_struct *p);
Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.
Backups are generally considered a good idea, but in this case, I'd
"A man with one watch knows what time it is. A man with two is
never certain."
Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.
Peter, I agonized over the access efficiency of dropping this one or the
duplicate in task_struct::pids and this one was far easier to drop in
terms of somehow always forcing
task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.

It should be possible to audit the kernel to make certain task->pid is
only ever written at the time of task creation and copied to its own
task->pids[PIDTYPE_PID].pid->numbers[0].nr at the time of task creation
so that the two values are consistent. Continuously auditing the kernel
so this is the case would be a bit more of a challenge.

Would it be reasonable to suggest task_struct::pid only be accessed by
the existing inlined task_pid_nr() converted to const?

The goal is to gain assurance that any PIDs referred to in audit logs
are indisputable.

Would you be alright with doing away with task_struct::tgid?
Post by Peter Zijlstra
Furthermore idle threads really are special and it doesn't make sense to
address them in any but the root namespace, doubly so because only
kernel space does this.
If that is the case, and the above holds true, then we don't need the
second hunk, I agree.
Post by Peter Zijlstra
As for the init thread, that function is called is_global_init() for
crying out loud, what numb nut doesn't get that that's supposed to be
using the root namespace?
A process in another pid namespace? If that's never going to happen,
then drop it.
Post by Peter Zijlstra
Seriously, you namespace guys should stop messing things up and
confusing yourselves and others.
"you namespace guys"? I'm not a namespace guy. I'm a rusty kernel
network security guy taking on audit, trying to prepare it for moving
into a more and more namespace-enabled place of
containerization/light-virtualization.

We aren't ready for it yet, but there is demand to run additional auditd
daemons in other pid namespaces and some of this work is trying to move
in that direction.

This patchset certainly wasn't intended to be ready for adoption just
yet. It was this sort of discussion I was hoping to have.


- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2013-08-27 12:11:50 UTC
Permalink
Post by Richard Guy Briggs
Post by Peter Zijlstra
Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.
Peter, I agonized over the access efficiency of dropping this one or the
duplicate in task_struct::pids and this one was far easier to drop in
terms of somehow always forcing
task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.
You mean there's more than 1 site that sets task_struct::pid? I thought
we only assign that thing once in fork.c someplace.
Post by Richard Guy Briggs
It should be possible to audit the kernel to make certain task->pid is
only ever written at the time of task creation and copied to its own
task->pids[PIDTYPE_PID].pid->numbers[0].nr at the time of task creation
so that the two values are consistent. Continuously auditing the kernel
so this is the case would be a bit more of a challenge.
I know there's people running scripts over git commits to catch abuse,
if this is scriptable that might be doable.
Post by Richard Guy Briggs
Would it be reasonable to suggest task_struct::pid only be accessed by
the existing inlined task_pid_nr() converted to const?
Sure that works for me, alternatively what's wrong with making
task_struct::pid itself a const pid_t ? Then assignment will need an
ugly cast to even work.
Post by Richard Guy Briggs
The goal is to gain assurance that any PIDs referred to in audit logs
are indisputable.
Would you be alright with doing away with task_struct::tgid?
So I don't particularly use that one much -- if at all. So no I don't
mind it too much.
Post by Richard Guy Briggs
Post by Peter Zijlstra
Furthermore idle threads really are special and it doesn't make sense to
address them in any but the root namespace, doubly so because only
kernel space does this.
If that is the case, and the above holds true, then we don't need the
second hunk, I agree.
It should be the case -- not entirely sure it is the case, but in any
case pid-0 should be 'special' by all accounts.
Post by Richard Guy Briggs
Post by Peter Zijlstra
As for the init thread, that function is called is_global_init() for
crying out loud, what numb nut doesn't get that that's supposed to be
using the root namespace?
A process in another pid namespace? If that's never going to happen,
then drop it.
That'd be a bug I suppose, you want the 'global' init when using that
function. I don't use this function, never have. So I really don't know
_that_ much about it. It just seems to me that the name really implies
it wants the root init process and not any other.
Post by Richard Guy Briggs
Post by Peter Zijlstra
Seriously, you namespace guys should stop messing things up and
confusing yourselves and others.
"you namespace guys"? I'm not a namespace guy. I'm a rusty kernel
network security guy taking on audit, trying to prepare it for moving
into a more and more namespace-enabled place of
containerization/light-virtualization.
Well, you let yourself in with 'those' people ;-)
Post by Richard Guy Briggs
We aren't ready for it yet, but there is demand to run additional auditd
daemons in other pid namespaces and some of this work is trying to move
in that direction.
So afaict that's 'simply' writing the 'right' pid to your logger, right?
Your additional concern that the pid space isn't corrupted sounds a bit
superfluous to me, we don't typically muck about with pids, stuff would
horribly break if we did that.

There's a few special cases, like the idle threads having pid-0 and
'simple' people like myself prefer to use task_struct::pid for debugging
when we run our simple kernels without all this namespace stuff enabled.

The entire task->pids[PIDTYPE_PID].pid->numbers[0].nr thing just seems
increddibly unwieldy and double dereferences, even if the lines are
'hot' are worse than single derefs.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Eric W. Biederman
2013-08-27 21:35:37 UTC
Permalink
Post by Peter Zijlstra
Post by Richard Guy Briggs
Post by Peter Zijlstra
Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.
Peter, I agonized over the access efficiency of dropping this one or the
duplicate in task_struct::pids and this one was far easier to drop in
terms of somehow always forcing
task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.
You mean there's more than 1 site that sets task_struct::pid? I thought
we only assign that thing once in fork.c someplace.
No there is not and that is not a concern.

Now I had thought given how the perf subsystem was constructed that only
the god like root was even allowed to use the code. But it turns out
there is sysctl_perf_event_paranoid that can bet twiddled that in some
circumstance that unprivileged users are allowed to use perf. Which
ultimately means perf will return the wrong data.

Meaning that perf is broken by design and perf has no excuse to be using
task->pid. Similarly every other place in the kernel that has made the
same mistake. I mention perf explicitly for two reasons. perf gets the
namespace handling horribly wrong, perf is the only place in the kernel
where we are accessing pids frequently enough for an extra cache line
miss to be a concern.

When really pids in the kernel what we care about is not some stupid
number but the stuct pid which points to that tasks, process groups, and
sessions. You know the object that a pid is the name for.

So yes as a long term direction task->pid and task->tgid need to be
killed because we keep getting subsystems like perf that return the
wrong data to userspace, or perform the wrong checks, because the
current structure makes it seem like it is ok to do the wrong thing.

Now that should not be Richard's fight. Hopefully he can focus on
fixing audit.
Post by Peter Zijlstra
There's a few special cases, like the idle threads having pid-0 and
'simple' people like myself prefer to use task_struct::pid for debugging
when we run our simple kernels without all this namespace stuff enabled.
Which is why a special printf format is likely a good idea because it
means you can easily print pids without needing to call ungainly helper
functions.

Of course you can't run kernels without this ``namespace'' stuff
enabled. The best you can do is run kernels without the ability to
create new instances of the namespaces.
Post by Peter Zijlstra
The entire task->pids[PIDTYPE_PID].pid->numbers[0].nr thing just seems
increddibly unwieldy and double dereferences, even if the lines are
'hot' are worse than single derefs.
But it is so much better than having to look up task->pid in a hash
table to get anything done, which is the previous state of affairs.

When the pid namespace support was merged except for a few overlooked
corner cases everything was converted except a bunch of printk
statements. Now I look in the kernel and we have had subsystems added
that totally get the namespace handling wrong because it is easy and
apparently socially acceptable to mess up other peoples hard work.

Apparently even Linus yelling at people a few years back wasn't enough
for people to wake up and be responsible developers and use proper
abstractions. So the only valid long term direction I can see is to
remove all of the abstractions that make getting pid handling wrong,
and to fix all of the code that gets it wrong today. So that there are
no more bad examples in the kernel.

This isn't Richard's fight, and this isn't what needs to happen with
audit. Audit just needs to be fixed so that so that it reports pid
numbers the audit daemon can make sense of, and to do that the audit
should use helper functions that are pid namespace aware and make it
clear that the proper pid namespace is being used.

In the long term ->pid and ->tgid must die, and take all of this wrong
think with it.

Eric

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2013-08-28 08:16:49 UTC
Permalink
Post by Eric W. Biederman
Post by Peter Zijlstra
Post by Richard Guy Briggs
Post by Peter Zijlstra
Except that's not the case, with namespaces there's a clear hierarchy
and the task_struct::pid is the one true value aka. root namespace.
Peter, I agonized over the access efficiency of dropping this one or the
duplicate in task_struct::pids and this one was far easier to drop in
terms of somehow always forcing
task->pids[PIDTYPE_PID].pid->numbers[0].nr to point to task->pid.
You mean there's more than 1 site that sets task_struct::pid? I thought
we only assign that thing once in fork.c someplace.
No there is not and that is not a concern.
Now I had thought given how the perf subsystem was constructed that only
the god like root was even allowed to use the code. But it turns out
there is sysctl_perf_event_paranoid that can bet twiddled that in some
circumstance that unprivileged users are allowed to use perf.
Even without poking at that, a user is always allowed to use perf on his
own tasks.
Post by Eric W. Biederman
Which
ultimately means perf will return the wrong data.
How so, perf uses pid-namespaces, have a look at
kernel/events/core.c:perf_event_[pt]id(). We store the namespace of the
task creating the event (which is also -- hopefully -- the consumer of
the data it generates). If you create an event and then switch
namespaces you've bigger issues I suppose.
Post by Eric W. Biederman
Meaning that perf is broken by design and perf has no excuse to be using
task->pid.
It doesn't.
Post by Eric W. Biederman
Similarly every other place in the kernel that has made the
same mistake. I mention perf explicitly for two reasons. perf gets the
namespace handling horribly wrong,
Do tell.
Post by Eric W. Biederman
perf is the only place in the kernel
where we are accessing pids frequently enough for an extra cache line
miss to be a concern.
When really pids in the kernel what we care about is not some stupid
number but the stuct pid which points to that tasks, process groups, and
sessions. You know the object that a pid is the name for.
So yes as a long term direction task->pid and task->tgid need to be
killed because we keep getting subsystems like perf that return the
wrong data to userspace, or perform the wrong checks, because the
current structure makes it seem like it is ok to do the wrong thing.
I think you should have a look at code before you start raving.. makes
you looks silly.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-08-23 19:34:33 UTC
Permalink
Post by Richard Guy Briggs
Post by Peter Zijlstra
Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.
Backups are generally considered a good idea, but in this case, I'd
And perhaps you are right. At least we can probably kill task->tgid.
And I agree, it would be nice to kill task->pid.

But: I also agree with Peter, we should try to not make the current
code more expensive.

Anyway. Imho, you should not mix the different things in one series.
If you want to fix audit, do not add the patches like 10/12 or 11/12
into this series.

Or 3/12. OK, I agree sys_getppid() in audit_log_task_info() looks
strange at least. Just fix it using the helpers we already have and
add the new helpers later. Or send the patch(es) which adds the new
helpers first.

Or task_pid_nr_init_ns()... For what? We already have task_pid_nr().
Use the helper we already have, or introduce the new one first and
change the current users of task_pid_nr().

In short. Fortunately you do not need to convince me, I do not
maintain audit or namespaces ;) But imho this series looks a bit
confusing.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-08-27 03:05:34 UTC
Permalink
Post by Oleg Nesterov
Post by Richard Guy Briggs
Post by Peter Zijlstra
Why would you ever want to do this? It just makes these tests more
expensive for no gain what so ff'ing ever.
Backups are generally considered a good idea, but in this case, I'd
And perhaps you are right. At least we can probably kill task->tgid.
And I agree, it would be nice to kill task->pid.
But: I also agree with Peter, we should try to not make the current
code more expensive.
I don't disagree. I was given some hope by Eric Biederman that it might
not cause cache-line misses...
Post by Oleg Nesterov
Anyway. Imho, you should not mix the different things in one series.
If you want to fix audit, do not add the patches like 10/12 or 11/12
into this series.
They were included to gain reassurance that PIDs reported in audit logs
were accurate. If those assurances can be made, then I can limit my
work to audit itself.
Post by Oleg Nesterov
Or 3/12.
That is a cleanup to make clear what parts are actually pid-related and
what isn't.
Post by Oleg Nesterov
OK, I agree sys_getppid() in audit_log_task_info() looks
strange at least. Just fix it using the helpers we already have and
add the new helpers later. Or send the patch(es) which adds the new
helpers first.
Patch 04/12 is that helper. It is used in only two places in audit (and
once in apparmor), so I could have duplicated the code, but since it
needs rcu locking, an inline funciton in the audit namespace seemed
somewhat pointless when it could just as easily be shared with other
subsystems.
Post by Oleg Nesterov
Or task_pid_nr_init_ns()... For what? We already have task_pid_nr().
Use the helper we already have, or introduce the new one first and
change the current users of task_pid_nr().
If task_struct::pid is definitely not going away, then that whole part
is moot and we'll just use task_pid_nr() as is.
Post by Oleg Nesterov
In short. Fortunately you do not need to convince me, I do not
maintain audit or namespaces ;) But imho this series looks a bit
confusing.
It is incomplete. The last step(s) were intended to purge
task_struct::pid (or abstract it using task_pid_nr()), which haven't
been submitted yet. The whole point of the patchset was to give
confidence in the PIDs reported in any audit logs.
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-08-27 17:18:06 UTC
Permalink
Post by Richard Guy Briggs
Post by Oleg Nesterov
Or 3/12.
That is a cleanup to make clear what parts are actually pid-related and
what isn't.
You know, I decided to send another email about this patch. This cleanup
doesn't look even correct.
Post by Richard Guy Briggs
Post by Oleg Nesterov
OK, I agree sys_getppid() in audit_log_task_info() looks
strange at least. Just fix it using the helpers we already have and
add the new helpers later. Or send the patch(es) which adds the new
helpers first.
Patch 04/12 is that helper. It is used in only two places in audit
I see what 3/4 do. But I am not sure we need this. At least in this
series.

OK, why do we need 3 new helper? audit needs only one,
task_ppid_nr_init_ns(). And who else needs this task_ppid* stuff?
Post by Richard Guy Briggs
once in apparmor),
OK, apparmor. So perhaps a single new helper in sched.c makes sense,
I dunno. But see above.
Post by Richard Guy Briggs
Post by Oleg Nesterov
Or task_pid_nr_init_ns()... For what? We already have task_pid_nr().
Use the helper we already have, or introduce the new one first and
change the current users of task_pid_nr().
If task_struct::pid is definitely not going away, then that whole part
is moot and we'll just use task_pid_nr() as is.
Can't understand. We already have task_tgid_nr(). This helper can be
changed to avoid ->tgig. Why task_ppid_nr_init_ns() can't use the
helper we already have?

But let me repeat. I am not maintainer and I do not really care.
You should convince Eric, I am not going to argue.


Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at

http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664

?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-08-30 19:06:57 UTC
Permalink
Post by Oleg Nesterov
Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at
http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664
(I don't want to lose these refs... First I've seen these.)

Why do you say this? Could you elaborate? Due to the state of the code
itself, or the lack of response from audit folks? (Like most, I'm not
subscribed to that firehose, so I don't have archives that show
addressees.) Most of the kernel audit folks are on
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Steve Grubb
2013-08-30 19:55:02 UTC
Permalink
Post by Oleg Nesterov
Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at
http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664
You don't want to clear the TIF audit flag when context == NULL. What that will
do is make a bunch of inauditable processes. There are times when audit is
disabled and then re-enabled later. If the flag gets cleared, then a task's
syscall will never enter the auditing framework from kernel/entry_64.S.

That flag is 0 when auditing has never ever been enabled. If auditing is
enabled, it should always be a 1 unless the task filter has determined that
this process should not be audited ever. In practice, this is almost never
used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why we
have the boot argument. Not setting audit=1 on the boot arguments means that
any process running before the audit daemon enables auditing can never ever be
audited because the only place its set is when processes are cloned.

Hope this clears up the use. NAK to the patch, it'll break auditing.

-Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-09-08 16:00:49 UTC
Permalink
Sorry for delay, vacation.

First of all, I do not pretend I understand this code. This was mostly
the question, and in fact I mostly asked about audit_bprm() in 0/1.

However,
Post by Steve Grubb
Post by Oleg Nesterov
Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at
http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664
You don't want to clear the TIF audit flag when context == NULL. What that will
do is make a bunch of inauditable processes. There are times when audit is
disabled and then re-enabled later. If the flag gets cleared, then a task's
syscall will never enter the auditing framework from kernel/entry_64.S.
That flag is 0 when auditing has never ever been enabled. If auditing is
enabled, it should always be a 1 unless the task filter has determined that
this process should not be audited ever. In practice, this is almost never
used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why we
have the boot argument. Not setting audit=1 on the boot arguments means that
any process running before the audit daemon enables auditing can never ever be
audited because the only place its set is when processes are cloned.
Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?

And I do not understand "when context == NULL" above. Say, audit_syscall_entry()
does nothing if !audit_context, and nobody except copy_process() does
audit_alloc(). So why do we need to trigger the audit's paths if it is NULL?
Post by Steve Grubb
Hope this clears up the use. NAK to the patch, it'll break auditing.
Not really, but thanks for your reply anyway.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-09-10 17:26:50 UTC
Permalink
Post by Oleg Nesterov
First of all, I do not pretend I understand this code. This was mostly
the question, and in fact I mostly asked about audit_bprm() in 0/1.
However,
Post by Steve Grubb
Post by Oleg Nesterov
Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at
http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664
You don't want to clear the TIF audit flag when context == NULL. What that will
do is make a bunch of inauditable processes. There are times when audit is
disabled and then re-enabled later. If the flag gets cleared, then a task's
syscall will never enter the auditing framework from kernel/entry_64.S.
That flag is 0 when auditing has never ever been enabled. If auditing is
enabled, it should always be a 1 unless the task filter has determined that
this process should not be audited ever. In practice, this is almost never
used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why we
have the boot argument. Not setting audit=1 on the boot arguments means that
any process running before the audit daemon enables auditing can never ever be
audited because the only place its set is when processes are cloned.
Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
And I do not understand "when context == NULL" above. Say, audit_syscall_entry()
does nothing if !audit_context, and nobody except copy_process() does
audit_alloc(). So why do we need to trigger the audit's paths if it is NULL?
Post by Steve Grubb
Hope this clears up the use. NAK to the patch, it'll break auditing.
Not really, but thanks for your reply anyway.
So, Steve, do you still think that patch was wrong? Attached below
just in case.

Oleg.

[PATCH 1/1] audit_alloc: clear TIF_SYSCALL_AUDIT if !audit_context

If audit_filter_task() nacks the new thread it makes sense
to clear TIF_SYSCALL_AUDIT which can be copied from parent
by dup_task_struct().

A wrong TIF_SYSCALL_AUDIT is not really bad, but it triggers
the "slow" audit paths in entry.S.

Signed-off-by: Oleg Nesterov <***@redhat.com>
---
kernel/auditsc.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 9845cb3..95293ab 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -943,8 +943,10 @@ int audit_alloc(struct task_struct *tsk)
return 0; /* Return if not auditing. */

state = audit_filter_task(tsk, &key);
- if (state == AUDIT_DISABLED)
+ if (state == AUDIT_DISABLED) {
+ clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return 0;
+ }

if (!(context = audit_alloc_context(state))) {
kfree(key);
--
1.5.5.1


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Steve Grubb
2013-09-13 18:43:06 UTC
Permalink
Post by Oleg Nesterov
Post by Oleg Nesterov
First of all, I do not pretend I understand this code. This was mostly
the question, and in fact I mostly asked about audit_bprm() in 0/1.
However,
Post by Steve Grubb
Post by Oleg Nesterov
Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at
http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664
You don't want to clear the TIF audit flag when context == NULL. What
that will do is make a bunch of inauditable processes. There are times
when audit is disabled and then re-enabled later. If the flag gets
cleared, then a task's syscall will never enter the auditing framework
from kernel/entry_64.S.
That flag is 0 when auditing has never ever been enabled. If auditing is
enabled, it should always be a 1 unless the task filter has determined that
this process should not be audited ever. In practice, this is almost never
used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is
why we have the boot argument. Not setting audit=1 on the boot
arguments means that any process running before the audit daemon
enables auditing can never ever be audited because the only place its
set is when processes are cloned.>
Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
And I do not understand "when context == NULL" above. Say,
audit_syscall_entry() does nothing if !audit_context, and nobody except
copy_process() does audit_alloc(). So why do we need to trigger the
audit's paths if it is NULL?>
Post by Steve Grubb
Hope this clears up the use. NAK to the patch, it'll break auditing.
Not really, but thanks for your reply anyway.
So, Steve, do you still think that patch was wrong? Attached below
just in case.
I think this looks OK. If the task filter NACK's auditing the process, then
clearing the flag is probably correct. I have design notes from back around the
2.6.7 kernel saying this was the intention.

ACK.

-Steve
Post by Oleg Nesterov
[PATCH 1/1] audit_alloc: clear TIF_SYSCALL_AUDIT if !audit_context
If audit_filter_task() nacks the new thread it makes sense
to clear TIF_SYSCALL_AUDIT which can be copied from parent
by dup_task_struct().
A wrong TIF_SYSCALL_AUDIT is not really bad, but it triggers
the "slow" audit paths in entry.S.
---
kernel/auditsc.c | 4 +++-
1 files changed, 3 insertions(+), 1 deletions(-)
diff --git a/kernel/auditsc.c b/kernel/auditsc.c
index 9845cb3..95293ab 100644
--- a/kernel/auditsc.c
+++ b/kernel/auditsc.c
@@ -943,8 +943,10 @@ int audit_alloc(struct task_struct *tsk)
return 0; /* Return if not auditing. */
state = audit_filter_task(tsk, &key);
- if (state == AUDIT_DISABLED)
+ if (state == AUDIT_DISABLED) {
+ clear_tsk_thread_flag(tsk, TIF_SYSCALL_AUDIT);
return 0;
+ }
if (!(context = audit_alloc_context(state))) {
kfree(key);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-09-14 18:16:20 UTC
Permalink
Post by Steve Grubb
Post by Oleg Nesterov
So, Steve, do you still think that patch was wrong? Attached below
just in case.
I think this looks OK. If the task filter NACK's auditing the process, then
clearing the flag is probably correct. I have design notes from back around the
2.6.7 kernel saying this was the intention.
Then I do not really understand your previous email... Nevermind ;)
Post by Steve Grubb
ACK.
Thanks. I'll resend this patch with your ack applied.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Steve Grubb
2013-09-13 18:28:27 UTC
Permalink
Post by Oleg Nesterov
Sorry for delay, vacation.
First of all, I do not pretend I understand this code. This was mostly
the question, and in fact I mostly asked about audit_bprm() in 0/1.
However,
Post by Steve Grubb
Post by Oleg Nesterov
Btw. audit looks unmaintained... if you are going to take care of
this code, perhaps you can look at
http://marc.info/?l=linux-kernel&m=137589907108485
http://marc.info/?l=linux-kernel&m=137590271809664
You don't want to clear the TIF audit flag when context == NULL. What that
will do is make a bunch of inauditable processes. There are times when
audit is disabled and then re-enabled later. If the flag gets cleared,
then a task's syscall will never enter the auditing framework from
kernel/entry_64.S.
That flag is 0 when auditing has never ever been enabled. If auditing is
enabled, it should always be a 1 unless the task filter has determined that
this process should not be audited ever. In practice, this is almost never
used. But ensuring the TIF_SYSCALL_AUDIT set to 1 on all processes is why
we have the boot argument. Not setting audit=1 on the boot arguments
means that any process running before the audit daemon enables auditing
can never ever be audited because the only place its set is when
processes are cloned.
Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
The code I'm looking at does right at the end of the function.
Post by Oleg Nesterov
And I do not understand "when context == NULL" above. Say,
audit_syscall_entry() does nothing if !audit_context, and nobody except
copy_process() does audit_alloc(). So why do we need to trigger the audit's
paths if it is NULL?
Because if you enter the audit framework, that means auditing has been turned
on at some point in the past, and could be turned back on at some point in the
future. If auditing has never been enabled, then you never enter the audit
framework at all. If the context is NULL, then audit is not currently enabled.

-Steve
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-09-14 18:15:14 UTC
Permalink
Post by Steve Grubb
Post by Oleg Nesterov
Then why audit_alloc() doesn't set TIF_SYSCALL_AUDIT unconditionally?
The code I'm looking at does right at the end of the function.
The code I'm looking at does right at the end too ;) but it also
returns at the start if audit_filter_task() returns AUDIT_DISABLED.
Post by Steve Grubb
Post by Oleg Nesterov
And I do not understand "when context == NULL" above. Say,
audit_syscall_entry() does nothing if !audit_context, and nobody except
copy_process() does audit_alloc(). So why do we need to trigger the audit's
paths if it is NULL?
Because if you enter the audit framework,
framework? TIF_SYSCALL_AUDIT has only meaning in entry.S, we need it
to ensure that the audited task can't miss audit_syscall_*().
Post by Steve Grubb
that means auditing has been turned
on at some point in the past, and could be turned back on at some point in the
future.
And this will change nothing, afaics (wrt TIF_SYSCALL_AUDIT).

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-08-27 17:28:38 UTC
Permalink
Added the functions
task_ppid()
task_ppid_nr_ns()
task_ppid_nr_init_ns()
to safely abstract the lookup of the PPID
but it is not safe.
+static inline struct pid *task_ppid(struct task_struct *task)
+{
+ return task_tgid(rcu_dereference(current->real_parent));
^^^^^^^
task?
+static inline pid_t task_ppid_nr_ns(struct task_struct *tsk,
+ struct pid_namespace *ns)
+{
+ pid_t pid;
+
+ rcu_read_lock();
+ pid = pid_nr_ns(task_ppid(current), ns);
^^^^^^^
again.
+ rcu_read_unlock();
And why this is safe?

rcu_read_lock() can't help if tsk was already dead _before_ it takes
the rcu lock. ->real_parent can point the already freed/reused/unmapped
memory.

This is safe if, for example, the caller alredy holds rcu_read_lock()
and tsk was found by find_task_by*(), or tsk is current.

Richard, just in case... I am going to vacation, I will be completely
offline till Sep 10.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-08-30 19:57:29 UTC
Permalink
Post by Oleg Nesterov
Added the functions
task_ppid()
task_ppid_nr_ns()
task_ppid_nr_init_ns()
to safely abstract the lookup of the PPID
but it is not safe.
+static inline struct pid *task_ppid(struct task_struct *task)
+{
+ return task_tgid(rcu_dereference(current->real_parent));
^^^^^^^
task?
Yup, thanks for those two catches.
Post by Oleg Nesterov
+ rcu_read_unlock();
And why this is safe?
rcu_read_lock() can't help if tsk was already dead _before_ it takes
the rcu lock. ->real_parent can point the already freed/reused/unmapped
memory.
Does it not bump a refcount if it is holding a pointer to it? So the
parent task might be dead, but it won't cause a pointer dereference
issue.
Post by Oleg Nesterov
This is safe if, for example, the caller alredy holds rcu_read_lock()
and tsk was found by find_task_by*(), or tsk is current.
Fair enough, I'll have a more careful look at this. Thanks.

Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...

So what is a reliable way of keeping a reference to a task? I had
assumed that the best way was to keep a pointer to its task_struct,
making sure its refcount had been bumped by something like
get_task_struct(). Another way would be to do the same with its struct
pid. The third that I'm trying to avoid is using its init_pid_namespace
pid_t since it could refer to a completely different task if the pid_t
rolls over.
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
John Johansen
2013-08-30 20:37:25 UTC
Permalink
Post by Richard Guy Briggs
Post by Oleg Nesterov
Added the functions
task_ppid()
task_ppid_nr_ns()
task_ppid_nr_init_ns()
to safely abstract the lookup of the PPID
but it is not safe.
+static inline struct pid *task_ppid(struct task_struct *task)
+{
+ return task_tgid(rcu_dereference(current->real_parent));
^^^^^^^
task?
Yup, thanks for those two catches.
Post by Oleg Nesterov
+ rcu_read_unlock();
And why this is safe?
rcu_read_lock() can't help if tsk was already dead _before_ it takes
the rcu lock. ->real_parent can point the already freed/reused/unmapped
memory.
Does it not bump a refcount if it is holding a pointer to it? So the
parent task might be dead, but it won't cause a pointer dereference
issue.
Post by Oleg Nesterov
This is safe if, for example, the caller alredy holds rcu_read_lock()
and tsk was found by find_task_by*(), or tsk is current.
Fair enough, I'll have a more careful look at this. Thanks.
Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...
the apparmor case isn't actually stored long term. The stored task will be
a parameter that was passed into an lsm hook and the buffer that it is
stored in dies before the hook is done. Its temporarily stored in the
struct so that it can be passed into the lsm_audit fn, and printed into an
allocated audit buffer. The text version in the audit buffer is what will
exist beyond the hook.

There are three patches, I'll reply them below once I have finished rebasing
them to apply to the current tree instead of my dev tree.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
John Johansen
2013-08-30 22:41:29 UTC
Permalink
Mediation is based off of the cred but auditing includes the current
task which may not be related to the actual request.

Signed-off-by: John Johansen <***@canonical.com>
---
security/apparmor/capability.c | 15 +++++----------
security/apparmor/domain.c | 2 +-
security/apparmor/include/capability.h | 5 ++---
security/apparmor/include/ipc.h | 4 ++--
security/apparmor/ipc.c | 9 ++++-----
security/apparmor/lsm.c | 2 +-
6 files changed, 15 insertions(+), 22 deletions(-)

diff --git a/security/apparmor/capability.c b/security/apparmor/capability.c
index 887a5e9..98a73eb 100644
--- a/security/apparmor/capability.c
+++ b/security/apparmor/capability.c
@@ -48,8 +48,7 @@ static void audit_cb(struct audit_buffer *ab, void *va)

/**
* audit_caps - audit a capability
- * @profile: profile confining task (NOT NULL)
- * @task: task capability test was performed against (NOT NULL)
+ * @profile: profile being tested for confinement (NOT NULL)
* @cap: capability tested
* @error: error code returned by test
*
@@ -58,8 +57,7 @@ static void audit_cb(struct audit_buffer *ab, void *va)
*
* Returns: 0 or sa->error on success, error code on failure
*/
-static int audit_caps(struct aa_profile *profile, struct task_struct *task,
- int cap, int error)
+static int audit_caps(struct aa_profile *profile, int cap, int error)
{
struct audit_cache *ent;
int type = AUDIT_APPARMOR_AUTO;
@@ -68,7 +66,6 @@ static int audit_caps(struct aa_profile *profile, struct task_struct *task,
sa.type = LSM_AUDIT_DATA_CAP;
sa.aad = &aad;
sa.u.cap = cap;
- sa.aad->tsk = task;
sa.aad->op = OP_CAPABLE;
sa.aad->error = error;

@@ -119,8 +116,7 @@ static int profile_capable(struct aa_profile *profile, int cap)

/**
* aa_capable - test permission to use capability
- * @task: task doing capability test against (NOT NULL)
- * @profile: profile confining @task (NOT NULL)
+ * @profile: profile being tested against (NOT NULL)
* @cap: capability to be tested
* @audit: whether an audit record should be generated
*
@@ -128,8 +124,7 @@ static int profile_capable(struct aa_profile *profile, int cap)
*
* Returns: 0 on success, or else an error code.
*/
-int aa_capable(struct task_struct *task, struct aa_profile *profile, int cap,
- int audit)
+int aa_capable(struct aa_profile *profile, int cap, int audit)
{
int error = profile_capable(profile, cap);

@@ -139,5 +134,5 @@ int aa_capable(struct task_struct *task, struct aa_profile *profile, int cap,
return error;
}

- return audit_caps(profile, task, cap, error);
+ return audit_caps(profile, cap, error);
}
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 01b7bd6..f037c57 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -75,7 +75,7 @@ static int may_change_ptraced_domain(struct task_struct *task,
if (!tracer || unconfined(tracerp))
goto out;

- error = aa_may_ptrace(tracer, tracerp, to_profile, PTRACE_MODE_ATTACH);
+ error = aa_may_ptrace(tracerp, to_profile, PTRACE_MODE_ATTACH);

out:
rcu_read_unlock();
diff --git a/security/apparmor/include/capability.h b/security/apparmor/include/capability.h
index c24d295..e4fea19 100644
--- a/security/apparmor/include/capability.h
+++ b/security/apparmor/include/capability.h
@@ -4,7 +4,7 @@
* This file contains AppArmor capability mediation definitions.
*
* Copyright (C) 1998-2008 Novell/SUSE
- * Copyright 2009-2010 Canonical Ltd.
+ * Copyright 2009-2013 Canonical Ltd.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License as
@@ -34,8 +34,7 @@ struct aa_caps {
kernel_cap_t extended;
};

-int aa_capable(struct task_struct *task, struct aa_profile *profile, int cap,
- int audit);
+int aa_capable(struct aa_profile *profile, int cap, int audit);

static inline void aa_free_cap_rules(struct aa_caps *caps)
{
diff --git a/security/apparmor/include/ipc.h b/security/apparmor/include/ipc.h
index aeda0fb..288ca76 100644
--- a/security/apparmor/include/ipc.h
+++ b/security/apparmor/include/ipc.h
@@ -19,8 +19,8 @@

struct aa_profile;

-int aa_may_ptrace(struct task_struct *tracer_task, struct aa_profile *tracer,
- struct aa_profile *tracee, unsigned int mode);
+int aa_may_ptrace(struct aa_profile *tracer, struct aa_profile *tracee,
+ unsigned int mode);

int aa_ptrace(struct task_struct *tracer, struct task_struct *tracee,
unsigned int mode);
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index c51d226..777ac1c 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -54,15 +54,14 @@ static int aa_audit_ptrace(struct aa_profile *profile,

/**
* aa_may_ptrace - test if tracer task can trace the tracee
- * @tracer_task: task who will do the tracing (NOT NULL)
* @tracer: profile of the task doing the tracing (NOT NULL)
* @tracee: task to be traced
* @mode: whether PTRACE_MODE_READ || PTRACE_MODE_ATTACH
*
* Returns: %0 else error code if permission denied or error
*/
-int aa_may_ptrace(struct task_struct *tracer_task, struct aa_profile *tracer,
- struct aa_profile *tracee, unsigned int mode)
+int aa_may_ptrace(struct aa_profile *tracer, struct aa_profile *tracee,
+ unsigned int mode)
{
/* TODO: currently only based on capability, not extended ptrace
* rules,
@@ -72,7 +71,7 @@ int aa_may_ptrace(struct task_struct *tracer_task, struct aa_profile *tracer,
if (unconfined(tracer) || tracer == tracee)
return 0;
/* log this capability request */
- return aa_capable(tracer_task, tracer, CAP_SYS_PTRACE, 1);
+ return aa_capable(tracer, CAP_SYS_PTRACE, 1);
}

/**
@@ -101,7 +100,7 @@ int aa_ptrace(struct task_struct *tracer, struct task_struct *tracee,
if (!unconfined(tracer_p)) {
struct aa_profile *tracee_p = aa_get_task_profile(tracee);

- error = aa_may_ptrace(tracer, tracer_p, tracee_p, mode);
+ error = aa_may_ptrace(tracer_p, tracee_p, mode);
error = aa_audit_ptrace(tracer_p, tracee_p, error);

aa_put_profile(tracee_p);
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 2e2a0dd..69c54c8 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -145,7 +145,7 @@ static int apparmor_capable(const struct cred *cred, struct user_namespace *ns,
if (!error) {
profile = aa_cred_profile(cred);
if (!unconfined(profile))
- error = aa_capable(current, profile, cap, audit);
+ error = aa_capable(profile, cap, audit);
}
return error;
}
--
1.8.3.2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
John Johansen
2013-08-30 22:42:36 UTC
Permalink
Now that aa_capabile no longer sets the task field it can be removed
and the lsm_audit version of the field can be used.

Signed-off-by: John Johansen <***@canonical.com>
---
security/apparmor/audit.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index 031d2d9..e32c448 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -111,7 +111,7 @@ static const char *const aa_audit_type[] = {
static void audit_pre(struct audit_buffer *ab, void *ca)
{
struct common_audit_data *sa = ca;
- struct task_struct *tsk = sa->aad->tsk ? sa->aad->tsk : current;
+ struct task_struct *tsk = sa->u.tsk ? sa->u.tsk : current;

if (aa_g_audit_header) {
audit_log_format(ab, "apparmor=");
@@ -149,12 +149,6 @@ static void audit_pre(struct audit_buffer *ab, void *ca)
audit_log_format(ab, " name=");
audit_log_untrustedstring(ab, sa->aad->name);
}
-
- if (sa->aad->tsk) {
- audit_log_format(ab, " pid=%d comm=", tsk->pid);
- audit_log_untrustedstring(ab, tsk->comm);
- }
-
}

/**
@@ -212,7 +206,7 @@ int aa_audit(int type, struct aa_profile *profile, gfp_t gfp,

if (sa->aad->type == AUDIT_APPARMOR_KILL)
(void)send_sig_info(SIGKILL, NULL,
- sa->aad->tsk ? sa->aad->tsk : current);
+ sa->u.tsk ? sa->u.tsk : current);

if (sa->aad->type == AUDIT_APPARMOR_ALLOWED)
return complain_error(sa->aad->error);
--
1.8.3.2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
John Johansen
2013-08-30 22:43:52 UTC
Permalink
The reporting of the parent task info is a vestage from old versions of
apparmor. The need for this information was removed by unique null-
profiles before apparmor was upstreamed so remove this info from logging.

Signed-off-by: John Johansen <***@canonical.com>
---
security/apparmor/audit.c | 6 ------
security/apparmor/include/audit.h | 1 -
2 files changed, 7 deletions(-)

diff --git a/security/apparmor/audit.c b/security/apparmor/audit.c
index e32c448..89c7865 100644
--- a/security/apparmor/audit.c
+++ b/security/apparmor/audit.c
@@ -111,7 +111,6 @@ static const char *const aa_audit_type[] = {
static void audit_pre(struct audit_buffer *ab, void *ca)
{
struct common_audit_data *sa = ca;
- struct task_struct *tsk = sa->u.tsk ? sa->u.tsk : current;

if (aa_g_audit_header) {
audit_log_format(ab, "apparmor=");
@@ -132,11 +131,6 @@ static void audit_pre(struct audit_buffer *ab, void *ca)

if (sa->aad->profile) {
struct aa_profile *profile = sa->aad->profile;
- pid_t pid;
- rcu_read_lock();
- pid = rcu_dereference(tsk->real_parent)->pid;
- rcu_read_unlock();
- audit_log_format(ab, " parent=%d", pid);
if (profile->ns != root_ns) {
audit_log_format(ab, " namespace=");
audit_log_untrustedstring(ab, profile->ns->base.hname);
diff --git a/security/apparmor/include/audit.h b/security/apparmor/include/audit.h
index 69d8cae..cf65443 100644
--- a/security/apparmor/include/audit.h
+++ b/security/apparmor/include/audit.h
@@ -110,7 +110,6 @@ struct apparmor_audit_data {
void *profile;
const char *name;
const char *info;
- struct task_struct *tsk;
union {
void *target;
struct {
--
1.8.3.2


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-09-03 18:32:37 UTC
Permalink
Post by John Johansen
Post by Richard Guy Briggs
Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...
the apparmor case isn't actually stored long term. The stored task will be
a parameter that was passed into an lsm hook and the buffer that it is
stored in dies before the hook is done. Its temporarily stored in the
struct so that it can be passed into the lsm_audit fn, and printed into an
allocated audit buffer. The text version in the audit buffer is what will
exist beyond the hook.
There are three patches, I'll reply them below once I have finished rebasing
them to apply to the current tree instead of my dev tree.
John, thanks for this context and fix. That helps simplify things.

- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer
Kernel Security
AMER ENG Base Operating Systems
Remote, Ottawa, Canada
Voice: +1.647.777.2635
Internal: (81) 32635
Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-12-11 14:48:39 UTC
Permalink
Post by Richard Guy Briggs
Post by John Johansen
Post by Richard Guy Briggs
Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...
the apparmor case isn't actually stored long term. The stored task will be
a parameter that was passed into an lsm hook and the buffer that it is
stored in dies before the hook is done. Its temporarily stored in the
struct so that it can be passed into the lsm_audit fn, and printed into an
allocated audit buffer. The text version in the audit buffer is what will
exist beyond the hook.
There are three patches, I'll reply them below once I have finished rebasing
them to apply to the current tree instead of my dev tree.
John, thanks for this context and fix. That helps simplify things.
John, What's the status of this set of 3 patches? I don't see them
upstream.
Post by Richard Guy Briggs
- RGB
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
John Johansen
2013-12-11 16:44:42 UTC
Permalink
Post by Richard Guy Briggs
Post by Richard Guy Briggs
Post by John Johansen
Post by Richard Guy Briggs
Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...
the apparmor case isn't actually stored long term. The stored task will be
a parameter that was passed into an lsm hook and the buffer that it is
stored in dies before the hook is done. Its temporarily stored in the
struct so that it can be passed into the lsm_audit fn, and printed into an
allocated audit buffer. The text version in the audit buffer is what will
exist beyond the hook.
There are three patches, I'll reply them below once I have finished rebasing
them to apply to the current tree instead of my dev tree.
John, thanks for this context and fix. That helps simplify things.
John, What's the status of this set of 3 patches? I don't see them
upstream.
they are part of the security tree merge in 3.13

51775fe apparmor: remove the "task" arg from may_change_ptraced_domain()
4a7fc30 apparmor: remove parent task info from audit logging
61e3fb8 apparmor: remove tsk field from the apparmor_audit_struct
dd0c6e8 apparmor: fix capability to not use the current task, during reporting


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-12-11 17:20:03 UTC
Permalink
Post by John Johansen
Post by Richard Guy Briggs
Post by Richard Guy Briggs
Post by John Johansen
Post by Richard Guy Briggs
Most of the instances are current, but the one called from apparmour is
stored. I've just learned that this is bad and someone else just chimed
in that they have a patch to remove it...
the apparmor case isn't actually stored long term. The stored task will be
a parameter that was passed into an lsm hook and the buffer that it is
stored in dies before the hook is done. Its temporarily stored in the
struct so that it can be passed into the lsm_audit fn, and printed into an
allocated audit buffer. The text version in the audit buffer is what will
exist beyond the hook.
There are three patches, I'll reply them below once I have finished rebasing
them to apply to the current tree instead of my dev tree.
John, thanks for this context and fix. That helps simplify things.
John, What's the status of this set of 3 patches? I don't see them
upstream.
they are part of the security tree merge in 3.13
51775fe apparmor: remove the "task" arg from may_change_ptraced_domain()
4a7fc30 apparmor: remove parent task info from audit logging
61e3fb8 apparmor: remove tsk field from the apparmor_audit_struct
dd0c6e8 apparmor: fix capability to not use the current task, during reporting
Ok, cool, so they will be upstream by the time I'll need them. Thanks!

- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2013-12-23 22:29:03 UTC
Permalink
This patchset is a revival of some of Eric Biederman's work to make audit
pid-namespace-safe.

In a couple of places, audit was printing PIDs in the task's pid namespace
rather than relative to the audit daemon's pid namespace, which currently is
init_pid_ns.

It also allows processes to log audit user messages in their own pid
namespaces, which was not previously permitted. Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=947530
https://bugs.launchpad.net/ubuntu/+source/vsftpd/+bug/1160372
https://bugzilla.novell.com/show_bug.cgi?id=786024

Part of the cleanup here involves deprecating task->pid and task->tgid, which
should be accessed using their respective helper functions.

See: https://lkml.org/lkml/2013/8/20/638

Richard Guy Briggs (5):
pid: get pid_t ppid of task in init_pid_ns
audit: convert PPIDs to the inital PID namespace.
audit: store audit_pid as a struct pid pointer
audit: anchor all pid references in the initial pid namespace
audit: allow user processes to log from another PID namespace

drivers/tty/tty_audit.c | 3 +-
include/linux/sched.h | 24 +++++++++++++++
kernel/audit.c | 54 ++++++++++++++++++++++++----------
kernel/audit.h | 4 +-
kernel/auditfilter.c | 17 ++++++++++-
kernel/auditsc.c | 24 ++++++++-------
security/apparmor/audit.c | 2 +-
security/integrity/integrity_audit.c | 2 +-
security/lsm_audit.c | 11 ++++--
9 files changed, 104 insertions(+), 37 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-12-30 17:04:17 UTC
Permalink
+static inline pid_t task_ppid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
+{
+ pid_t pid;
+
+ rcu_read_lock();
+ pid = pid_alive(tsk) ?
+ task_pid_nr_ns(rcu_dereference(tsk->real_parent), ns) : 0;
+ rcu_read_unlock();
+
+ return pid;
+}
I do not really mind, but perhaps

pid_t pid = 0;

rcu_read_lock();
if (pid_alive(task))
pid = task_pid_nr_ns(rcu_dereference(tsk->real_parent);
rcu_read_unlock();

return pid;

looks a bit cleaner.
+static inline pid_t task_ppid_nr(struct task_struct *tsk)
+{
+ pid_t pid;
+
+ rcu_read_lock();
+ pid = pid_alive(tsk) ?
+ task_pid_nr(rcu_dereference(tsk->real_parent)) : 0;
+ rcu_read_unlock();
+
+ return pid;
+}
It could simply do

return task_ppid_nr_ns(tsk, init_pid_ns);

but again, I won't argue.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-12-30 17:07:24 UTC
Permalink
@@ -1839,10 +1839,10 @@ void audit_log_task_info(struct audit_buffer *ab, struct task_struct *tsk)
spin_unlock_irq(&tsk->sighand->siglock);
audit_log_format(ab,
- " ppid=%ld pid=%d auid=%u uid=%u gid=%u"
+ " ppid=%d pid=%d auid=%u uid=%u gid=%u"
" euid=%u suid=%u fsuid=%u"
" egid=%u sgid=%u fsgid=%u tty=%s ses=%u",
- sys_getppid(),
+ task_ppid_nr(tsk),
Hmm. But sys_getppid() returns tgid, not pid.

This probably means that 1/5 should use task_tgid_nr_*() ?

Note that ->real_parent is not necessarily the group leader.
@@ -459,7 +459,7 @@ static int audit_filter_rules(struct task_struct *tsk,
if (ctx) {
if (!ctx->ppid)
- ctx->ppid = sys_getppid();
+ ctx->ppid = task_ppid_nr(tsk);
The same.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-12-30 17:50:48 UTC
Permalink
- PID will be reported in the relevant querying PID namespace.
- Refuse to change the current audit_pid if the new value does not
point to an existing process.
- Refuse to set the current audit_pid if the new value is not its own PID
(unless it is being unset).
I started to read the changelog only after I failed to understand the
patch. And it looks confusing too.

"Refuse to set the current audit_pid if the new value is not its own PID",
OK, but if the new value is the caller's pid then it should obviously
point to the existing process, current?
- Convert audit_pid into the initial pid namespace for reports
I can't apply this patch (and the previous one) due to multiple rejects,
I guess it depends on other changes?

In any case, this patch looks wrong.
@@ -412,9 +412,11 @@ static void kauditd_send_skb(struct sk_buff *skb)
BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
if (audit_pid) {
pr_err("audit: *NO* daemon at audit_pid=%d\n",
- audit_pid);
+ pid_nr(audit_pid));
audit_log_lost("auditd disappeared\n");
- audit_pid = 0;
+ put_pid(audit_pid);
But this can actually free this pid. Why it is safe to use it elsewhere?
+ audit_pid = NULL;
This also means that every "if (audit_pid)" is racy.
@@ -814,12 +816,26 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
return err;
}
if (s.mask & AUDIT_STATUS_PID) {
- int new_pid = s.pid;
-
- if ((!new_pid) && (task_tgid_vnr(current) != audit_pid))
+ struct pid *new_pid = find_get_pid(s.pid);
+ if (s.pid && !new_pid)
+ return -ESRCH;
can't understand... find_get_pid(s.pid) is pointless if s.pid == 0 ?

struct pid *new_pid = NULL;

if (new_pid) {
new_pid = find_get_pid(s.pid);
if (!new_pid)
return -ESRCH;
}

looks more understandable.
+ /* check that non-zero pid it is trying to set is its
+ * own*/
+ if (s.pid && (s.pid != task_pid_vnr(current)))
again, this looks a bit confusing and suboptimal. Imho it would be better
to add

if (new_pid != task_tgid(current))

into the "if (s.pid)" above. Hmm, actually task_pid_vnr() above looks
simply wrong, we need tgid or I am totally confused.

OTOH, if we require s.pid == task_pid_vnr(current), then why do we need
find_pid() at all?
+ return -EPERM;
this lacks put_pid(new_pid).
+ /* check that it isn't orphaning another process */
+ if ((!new_pid) &&
+ (task_tgid_vnr(current) != pid_vnr(audit_pid)))
return -EACCES;
and this can go into the "else" branch.

And I can't understand the "it isn't orphaning another process" logic...

OK, what if s.pid == 0 and audit_pid == NULL, should we fail in this case?

And I do not see how this matches "Refuse to set the current audit_pid
if the new value is not its own PID" from the changelog.
+
audit_pid = new_pid;
Another leak, it seems.
@@ -1331,7 +1347,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
return NULL;
if (gfp_mask & __GFP_WAIT) {
- if (audit_pid && audit_pid == current->pid)
+ if (pid_nr(audit_pid) == task_pid_nr(current))
So is this audit_pid tid or tgid? Its usage looks totally confusing.

Anyway,

if (audit_pid == task_pid(current))

(or probably task_tgid) looks much better.
+ //if (pid_task(audit_pid, PIDTYPE_PID) == current)
in this case you need to update Documentation/CodingStyle ;)

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2014-01-21 23:38:20 UTC
Permalink
Post by Oleg Nesterov
- PID will be reported in the relevant querying PID namespace.
- Refuse to change the current audit_pid if the new value does not
point to an existing process.
- Refuse to set the current audit_pid if the new value is not its own PID
(unless it is being unset).
I started to read the changelog only after I failed to understand the
patch. And it looks confusing too.
"Refuse to set the current audit_pid if the new value is not its own PID",
OK, but if the new value is the caller's pid then it should obviously
point to the existing process, current?
Yes, but it may be lying, which helps nobody. The value should either
be its own pid, or zero.
Post by Oleg Nesterov
- Convert audit_pid into the initial pid namespace for reports
I can't apply this patch (and the previous one) due to multiple rejects,
I guess it depends on other changes?
It depends on other patches in my for-next tree, but not necessarily
functionally.
Post by Oleg Nesterov
In any case, this patch looks wrong.
@@ -412,9 +412,11 @@ static void kauditd_send_skb(struct sk_buff *skb)
BUG_ON(err != -ECONNREFUSED); /* Shouldn't happen */
if (audit_pid) {
pr_err("audit: *NO* daemon at audit_pid=%d\n",
- audit_pid);
+ pid_nr(audit_pid));
audit_log_lost("auditd disappeared\n");
- audit_pid = 0;
+ put_pid(audit_pid);
But this can actually free this pid. Why it is safe to use it elsewhere?
+ audit_pid = NULL;
This also means that every "if (audit_pid)" is racy.
Ok, so I really need something like:

if (audit_pid) {
struct pid *temp_pid = audit_pid;
pr_err("*NO* daemon at audit_pid=%d\n", pid_nr(audit_pid));
audit_log_lost("auditd disappeared\n");
audit_pid = NULL;
smp_mb();
put_pid(temp_pid);
audit_sock = NULL;
}

Do I actually need the memory barrier there? Is that the right one to
use?
Post by Oleg Nesterov
@@ -814,12 +816,26 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
return err;
}
if (s.mask & AUDIT_STATUS_PID) {
- int new_pid = s.pid;
-
- if ((!new_pid) && (task_tgid_vnr(current) != audit_pid))
+ struct pid *new_pid = find_get_pid(s.pid);
+ if (s.pid && !new_pid)
+ return -ESRCH;
can't understand... find_get_pid(s.pid) is pointless if s.pid == 0 ?
"can't understand"? I hope you mean "don't understand". ;-)
Post by Oleg Nesterov
struct pid *new_pid = NULL;
if (new_pid) {
I think you mean "if (s.pid) {".
Post by Oleg Nesterov
new_pid = find_get_pid(s.pid);
if (!new_pid)
return -ESRCH;
}
looks more understandable.
Agreed.
Post by Oleg Nesterov
+ /* check that non-zero pid it is trying to set is its
+ * own*/
+ if (s.pid && (s.pid != task_pid_vnr(current)))
again, this looks a bit confusing and suboptimal. Imho it would be better
to add
if (new_pid != task_tgid(current))
It is not so obvious to me, because this code had been dealing with
pid_t rather than struct pid*, but I agree it may be more optimal.
Post by Oleg Nesterov
into the "if (s.pid)" above. Hmm, actually task_pid_vnr() above looks
simply wrong, we need tgid or I am totally confused.
The use of task_tgid_vnr() in isolation in commit 34eab0a7 was an error,
I believe, and should have been task_pid_vnr() unless new_pid was then
converted to the thread group leader before being assigned to audit_pid.
I believe auditd only ever registers from the thread group leader, but I
will have to check that assumption with Steve and Eric. If not, I may
have to change it to reference the tgid, making this code a bit more
complex since it would need to convert from pid_t to struct pid*, then
find the tgid.

I'd need to replace:
new_pid = find_get_pid(s.pid);

with:
rcu_read_lock();
new_pid = get_pid(task_tgid(find_task_by_vpid(s.pid)))
rcu_read_unlock();

or it might be nicer to define:
struct pid *audit_find_get_tgid(pid_t vnr) {
rcu_read_lock();
new_pid = get_pid(task_tgid(find_task_by_vpid(vnr)));
rcu_read_unlock();
}

or better yet in kernel/pid.c:
struct pid *find_get_tgid(pid_t vnr) {
rcu_read_lock();
new_pid = get_pid(task_tgid(find_task_by_vpid(vnr)));
rcu_read_unlock();
}

Do you have another reason to believe we need task_tgid_vnr()?

Part of the answer is in 582edda5, but if my assumption above is
correct, we don't need it here.
Post by Oleg Nesterov
OTOH, if we require s.pid == task_pid_vnr(current), then why do we need
find_pid() at all?
Because userspace may be trying to set it to zero on shutdown, but I
suppose we could special-case zero.
Post by Oleg Nesterov
+ return -EPERM;
this lacks put_pid(new_pid).
Thanks.
Post by Oleg Nesterov
+ /* check that it isn't orphaning another process */
+ if ((!new_pid) &&
+ (task_tgid_vnr(current) != pid_vnr(audit_pid)))
return -EACCES;
and this can go into the "else" branch.
Yup.
Post by Oleg Nesterov
And I can't understand the "it isn't orphaning another process" logic...
There was a potential race condition that if a second auditd started up
while there was still one running, when the first eventually shut down,
it would try to set the audit_pid to zero to indicate it was going away
and orphan the newer auditd from kaudit.

If we check that the pid of the auditd trying to set it to zero was the
same as the pid recorded in kaudit, we can safely assume it is the
newest and safely set audit_pid to zero.

A newer auditd can orphan a previous auditd, but this isn't really
avoidable since it may be awol and need replacing anyways.
Post by Oleg Nesterov
OK, what if s.pid == 0 and audit_pid == NULL, should we fail in this case?
It doesn't really matter since auditd isn't likely to try this
combinaiton and it won't matter if it fails.
Post by Oleg Nesterov
And I do not see how this matches "Refuse to set the current audit_pid
if the new value is not its own PID" from the changelog.
+
audit_pid = new_pid;
Another leak, it seems.
Oops, thanks.
Post by Oleg Nesterov
@@ -1331,7 +1347,8 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
return NULL;
if (gfp_mask & __GFP_WAIT) {
- if (audit_pid && audit_pid == current->pid)
+ if (pid_nr(audit_pid) == task_pid_nr(current))
So is this audit_pid tid or tgid? Its usage looks totally confusing.
pid, as noted above, but could be changed to tgid.
Post by Oleg Nesterov
Anyway,
if (audit_pid == task_pid(current))
(or probably task_tgid) looks much better.
Fair enough, compare struct pid* rather than the original pid_t.
Post by Oleg Nesterov
+ //if (pid_task(audit_pid, PIDTYPE_PID) == current)
in this case you need to update Documentation/CodingStyle ;)
Heh... trust the SCM... and that was comparing task_structs rather
than pid_t or struct pid*. Still getting comfortable with struct pid*.
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Oleg Nesterov
2013-12-30 18:05:58 UTC
Permalink
Store and log all PIDs with reference to the initial PID namespace and
use the access functions task_pid_nr() and task_tgid_nr() for task->pid
and task->tgid rather than access them directly.
At first glance this patch looks like a good cleanup, but...
@@ -429,6 +429,19 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
f->val = 0;
}
+ if ((f->type == AUDIT_PID) || (f->type == AUDIT_PPID)) {
+ struct pid *pid;
+ rcu_read_lock();
+ pid = find_vpid(f->val);
+ if (!pid) {
+ rcu_read_unlock();
+ err = -ESRCH;
+ goto exit_free;
+ }
+ f->val = pid_nr(pid);
+ rcu_read_unlock();
+ }
I do not really understand this change, but this doesn't matter, I do
not understand audit.

However, I think this deserves a separate patch with the changelog.
@@ -278,9 +278,12 @@ static void dump_common_audit_data(struct audit_buffer *ab,
}
tsk = a->u.tsk;
- if (tsk && tsk->pid) {
- audit_log_format(ab, " pid=%d comm=", tsk->pid);
- audit_log_untrustedstring(ab, tsk->comm);
+ if (tsk) {
+ pid_t pid = task_pid_nr(tsk);
+ if (pid) {
+ audit_log_format(ab, " pid=%d comm=", pid);
+ audit_log_untrustedstring(ab, tsk->comm);
Just curious, is it really possible that a->u.tsk is an idle thread?

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2014-02-19 20:29:19 UTC
Permalink
Post by Oleg Nesterov
Store and log all PIDs with reference to the initial PID namespace and
use the access functions task_pid_nr() and task_tgid_nr() for task->pid
and task->tgid rather than access them directly.
At first glance this patch looks like a good cleanup, but...
@@ -429,6 +429,19 @@ static struct audit_entry *audit_data_to_entry(struct audit_rule_data *data,
f->val = 0;
}
+ if ((f->type == AUDIT_PID) || (f->type == AUDIT_PPID)) {
+ struct pid *pid;
+ rcu_read_lock();
+ pid = find_vpid(f->val);
+ if (!pid) {
+ rcu_read_unlock();
+ err = -ESRCH;
+ goto exit_free;
+ }
+ f->val = pid_nr(pid);
+ rcu_read_unlock();
+ }
I do not really understand this change, but this doesn't matter, I do
not understand audit.
Is this pid_t handed down from userspace a valid pid_t in its
namespace? If not, generate an error. If it is, store its global
pid_t for future comparisons. It might be better to store a struct
task_struct * or a struct pid *, but then the comparison functions would
have to change too, along with the rule reporting functions, which still
report pid_t to userspace.
Post by Oleg Nesterov
However, I think this deserves a separate patch with the changelog.
All *that* would need a seperate patch...
Post by Oleg Nesterov
@@ -278,9 +278,12 @@ static void dump_common_audit_data(struct audit_buffer *ab,
}
tsk = a->u.tsk;
- if (tsk && tsk->pid) {
- audit_log_format(ab, " pid=%d comm=", tsk->pid);
- audit_log_untrustedstring(ab, tsk->comm);
+ if (tsk) {
+ pid_t pid = task_pid_nr(tsk);
+ if (pid) {
+ audit_log_format(ab, " pid=%d comm=", pid);
+ audit_log_untrustedstring(ab, tsk->comm);
Just curious, is it really possible that a->u.tsk is an idle thread?
No. It is possible a->u.tsk isn't filled in.
Post by Oleg Nesterov
Oleg.
- RGB

--
Richard Guy Briggs <***@redhat.com>
Senior Software Engineer, Kernel Security, AMER ENG Base Operating Systems, Red Hat
Remote, Ottawa, Canada
Voice: +1.647.777.2635, Internal: (81) 32635, Alt: +1.613.693.0684x3545
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Richard Guy Briggs
2014-02-19 20:59:06 UTC
Permalink
This patchset is a revival of some of Eric Biederman's work to make audit
pid-namespace-safe.

In a couple of places, audit was printing PIDs in the task's pid namespace
rather than relative to the audit daemon's pid namespace, which currently is
init_pid_ns.

It also allows processes to log audit user messages in their own pid
namespaces, which was not previously permitted. Please see:
https://bugzilla.redhat.com/show_bug.cgi?id=947530
https://bugs.launchpad.net/ubuntu/+source/vsftpd/+bug/1160372
https://bugzilla.novell.com/show_bug.cgi?id=786024

Part of the cleanup here involves deprecating task->pid and task->tgid, which
should be accessed using their respective helper functions.

See: https://lkml.org/lkml/2013/8/20/638


Richard Guy Briggs (5):
pid: get pid_t ppid of task in init_pid_ns
audit: convert PPIDs to the inital PID namespace.
audit: store audit_pid as a struct pid pointer
audit: anchor all pid references in the initial pid namespace
audit: allow user processes to log from another PID namespace

drivers/tty/tty_audit.c | 3 +-
include/linux/sched.h | 18 ++++++++++
kernel/audit.c | 59 ++++++++++++++++++++++++----------
kernel/audit.h | 4 +-
kernel/auditfilter.c | 17 +++++++++-
kernel/auditsc.c | 24 +++++++------
security/integrity/integrity_audit.c | 2 +-
security/lsm_audit.c | 11 ++++--
8 files changed, 101 insertions(+), 37 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

Loading...