Post by do1
Post by Tetsuo Handa
There was a very long history of
battle between pathname based access control (e.g. AppArmor) and inode based
access control (e.g. SELinux). Since pathname based access control had
an advantage which cannot be achieved using inode based access control, TOMOYO
and AppArmor were able to join the mainline. (The advantage is not the ease of
use; see http://sourceforge.jp/projects/tomoyo/docs/lfj2008-bof.pdf and
http://sourceforge.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf for examples.)
Great slides, thanks for them.
In lfj2008-bof.pdf you say that, to protect /etc/shadow from being read when it
is linked to /tmp/shadow, pathname based access control needs "to restrict
pathname changes", whereas label based access control does not need to care
about it. So you are saying yourself that it is possible to fully control
access to /etc/shadow if we also control renaming, linking and mounting.
I didn't say I can perfectly control access to /etc/shadow . I said we need to
care about not only readability/writability/executability of a file but also
the location of the file because whether the system works as expected depends
on whether resources are available at expected location. I said name based
access control can care about the location of resources better than inode based
access control. My opinion is that inode based access control and name based
access control complement each other, and therefore both access controls
should be used together. LSM stacking is now under discussion and we
will be able to run both access controls in parallel by the end of this year.
Name based access control can control access to a file only if the file to
protect has uniquely identifiable pathnames. A temporary file created in /tmp/
may have random names like /tmp/w4gZ6h and multiple applications may create
such temporary files. But name based access control cannot distinguish which
temporary file should be accessible from which application because name has no
meaning in this case. Only inode based access control which associates creator
application's information can correctly distinguish such temporary files.
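The temporary file case can be sketched in C as a small userspace model. This
is only an illustration under assumptions: struct model_file, the two check
functions, and the labels "app_a"/"app_b" are hypothetical names, not kernel
or LSM APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of a temporary file: its random pathname plus a label
 * recording which application created it (as an inode based scheme would). */
struct model_file {
    const char *path;          /* e.g. "/tmp/w4gZ6h" */
    const char *creator_label; /* e.g. "app_a" (made-up label) */
};

/* Name based check: the only rule we can write is a prefix match on /tmp/,
 * because the random part of the name carries no meaning. */
static bool name_based_allows(const char *rule_prefix,
                              const struct model_file *f)
{
    assert(rule_prefix && f);
    return strncmp(f->path, rule_prefix, strlen(rule_prefix)) == 0;
}

/* Label based check: compare the subject with the label stored on the file. */
static bool label_based_allows(const char *subject,
                               const struct model_file *f)
{
    assert(subject && f);
    return strcmp(subject, f->creator_label) == 0;
}
```

With a "/tmp/" prefix rule, the name based check accepts every temporary file
regardless of which application created it, while the label based check
accepts only the files created by the requesting application.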
Not all files have uniquely identifiable names. Also, the same pathname in
different namespaces may refer to different resources. Inode based access
control sometimes handles such cases better than name based access control.
Post by do1
And all these things are controllable. So why are you now saying that perfect
protection using pathname based access control fails, and that I should
understand and accept it?
Only partially controllable. Not always perfectly controllable.
As I said above, name based access control can protect resources only when
resources have limited and meaningful names. That is the reason you can't
perfectly protect using pathname based access control.
Post by do1
I don't understand, since you say yourself in the pdf that it is possible. It
just requires more work. Plus, you then show that this extra work may in fact
be required and is a good thing, which I agree with. (For example, binding
/etc/ to /tmp: we need protection against this, it is a pathname based feature,
and it also solves the problem of /etc/shadow being manipulated via pathnames.)
So why does it fail, or why is it not compatible with read-only directories?
It looks compatible and your words support it.
A mount/directory which is read-only in one namespace can be read/write in
another namespace. Pathname based access control can't handle this because
there is no means to distinguish namespaces.
Post by do1
Post by Tetsuo Handa
This is impossible because of how pathnames are managed in Linux.
Three data structures are involved here: "struct inode", "struct dentry" and
"struct vfsmount". A pathname is converted to a "struct vfsmount" and
"struct dentry" pair. "struct inode" can be determined via
"struct dentry"->d_inode and parent directory can be determined via
"struct dentry"->d_parent . But a "struct vfsmount" and "struct dentry" pair
which is needed for calculating a canonical pathname cannot be determined from
"struct inode". Therefore, we cannot calculate a pathname from "struct inode".
I know. But this is a 'traverse' operation, so you possibly have parts of the
pathname (because it is supposed to traverse the elements of the pathname), not
just inode numbers. (How does the Smack LSM enforce r or x on a directory? LSM
probably has some hooks into the traverse mechanism. I am not sure whether it
has parts of the pathname at that time, though; if not, then of course it is
not possible.)
No partial pathname is available for traverse operation. SELinux and SMACK do not
calculate pathnames for checking permissions. The hook which SELinux and SMACK
use for checking traverse permission receives only "struct inode".
int security_inode_permission(struct inode *inode, int mask);
Since "struct dentry" and "struct vfsmount" are not passed to
security_inode_permission() hook, pathname based access control (e.g. TOMOYO,
AppArmor, AKARI, CaitSith) cannot use pathname when checking permission for
traverse operation. Of course, you can join both the LSM mailing list and the
FS-devel mailing list and persuade the maintainers of both to pass
"struct dentry" and "struct vfsmount"; good luck.
Post by do1
In any case you may be able to calculate the directory realpath from the dentry
(just cd to '..' until / is met, I think that's how realpath already works.)
Not true. Pathname based access control's realpath is calculated from a
"struct dentry" and "struct vfsmount" pair.
If we have only "struct dentry", we can calculate partial pathname only up to
mount point which the "struct dentry" belongs to; in other words, we cannot
calculate till / is met if "struct dentry" does not belong to / partition.
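As a rough userspace illustration of this point (not kernel code), the sketch
below models the parent links with hypothetical struct model_dentry and
struct model_mount types. The function names and the example layout, where
/home is assumed to be a separate partition, are made up for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel structures; only the fields needed
 * for this illustration are modelled. */
struct model_dentry {
    const char *name;             /* component name, "/" for a mount root */
    struct model_dentry *parent;  /* like d_parent; a root points to itself */
};

struct model_mount {
    struct model_dentry *mountpoint; /* dentry in parent mount, NULL for / */
    struct model_mount *parent;      /* parent mount, NULL for / */
};

/* Prepend "/name" to the string already stored in buf. */
static void prepend(char *buf, size_t size, const char *name)
{
    char tmp[256];
    snprintf(tmp, sizeof(tmp), "/%s%s", name, buf);
    snprintf(buf, size, "%s", tmp);
}

/* Follow parent links only: stops at the root of the containing mount. */
static void path_from_dentry(struct model_dentry *d, char *buf, size_t size)
{
    assert(size > 0);
    buf[0] = '\0';
    while (d->parent != d) {
        prepend(buf, size, d->name);
        d = d->parent;
    }
    if (buf[0] == '\0')
        snprintf(buf, size, "/");
}

/* Use the mount as well: cross each mount point until / is reached. */
static void path_from_pair(struct model_mount *m, struct model_dentry *d,
                           char *buf, size_t size)
{
    assert(size > 0);
    buf[0] = '\0';
    for (;;) {
        while (d->parent != d) {
            prepend(buf, size, d->name);
            d = d->parent;
        }
        if (!m->parent)
            break;              /* reached the root mount */
        d = m->mountpoint;      /* continue from the mount point */
        m = m->parent;
    }
    if (buf[0] == '\0')
        snprintf(buf, size, "/");
}
```

For a dentry named "backup" on the partition assumed to be mounted at /home,
path_from_dentry() yields only "/backup", while path_from_pair() yields
"/home/backup".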
Post by do1
Post by Tetsuo Handa
On the contrary, we have hooks for checking permissions (which TOMOYO,
AppArmor, AKARI, CaitSith use) which are called after a pathname was converted
to a "struct vfsmount" and "struct dentry" pair. We can calculate a pathname
from these hooks but we cannot determine whether we have traversed a directory
inode which has path.ino=1234567 path.major=8 path.minor=2 attributes or not.
But Smack is an LSM and it can restrict traverse operation. So there should be some hooks.
There is a hook for checking permission for traverse operation. But we cannot
calculate pathname from that hook. SMACK can work because SMACK does not use
pathnames for checking permissions, while your proposal can't be implemented
because your proposal needs to calculate pathnames from that hook.
Post by do1
Post by Tetsuo Handa
The concept of canonical pathname does not work.
This example does not say anything about canonical pathnames. It tries to
forbid traverse by min/maj/inode.
There are multiple routes to reach the directory containing "yourbackup" file.
Try the operations below. (Note the $ prompt which means non-root user.)
(1) $ mkdir -m 777 -p /tmp/dir1/dir2/dir3/
(2) $ echo hello > /tmp/dir1/dir2/dir3/file
(3) $ mkdir -m 777 -p /tmp/dir0/
(4) $ su - root -c "mount --bind /tmp/dir1/dir2/dir3/ /tmp/dir0/"
(5) $ cd /tmp/dir1/dir2/dir3/
(6) $ chmod 000 /tmp/dir1/
(7) $ cat /tmp/dir1/dir2/dir3/file
cat: /tmp/dir1/dir2/dir3/file: Permission denied
(8) $ cat file
hello
(9) $ chmod 777 /tmp/dir1/
(10) $ chmod 000 /tmp/dir1/dir2/
(11) $ cat /tmp/dir1/dir2/dir3/file
cat: /tmp/dir1/dir2/dir3/file: Permission denied
(12) $ cat file
hello
(13) $ chmod 777 /tmp/dir1/dir2/
(14) $ chmod 000 /tmp/dir1/dir2/dir3/
(15) $ cat /tmp/dir1/dir2/dir3/file
cat: /tmp/dir1/dir2/dir3/file: Permission denied
(16) $ cat file
cat: file: Permission denied
(17) $ chmod 777 /tmp/dir1/dir2/dir3
(18) $ chmod 000 /tmp/dir1/dir2/
(19) $ cat /tmp/dir1/dir2/dir3/file
cat: /tmp/dir1/dir2/dir3/file: Permission denied
(20) $ cat /tmp/dir0/file
hello
Suppose your min/maj/inode traversal checking idea is implemented and the
min/maj/inode in the rule is set to the attributes of /tmp/dir1/dir2/ at the
time the rules are loaded. You will still fail to block (20), because
/tmp/dir0/file reaches /tmp/dir1/dir2/dir3/file even though /tmp/dir1/dir2/ is
not within /tmp/dir0/ .
Post by do1
Post by Tetsuo Handa
Trying to restrict based on the grandparent directory's inode and/or its
ancestors' inodes does not work because only the last component's inode and its
parent directory's inode are guaranteed to be checked, for a process might
request pathnames relative to the current directory (e.g. unlink("yourbackup")
rather than unlink("/home/backup/year/month/day/yourbackup") when its current
directory is /home/backup/year/month/day/ ).
I understand that, but this does not mean this approach does not work at all.
For example, if I forbid access to some min/maj/inode (directory) altogether,
then the user will not be able to chdir to any pathname underneath it and then
unlink("backupfile"). So this will work. (Plus, we may still be able to
determine the realpath of the parent directory.)
Comparing (8), (12) and (16), you can see that only (16) failed.
In other words, only the permissions of /tmp/dir1/dir2/dir3/ and
/tmp/dir1/dir2/dir3/file are checked if the current directory is already
/tmp/dir1/dir2/dir3/ and the requested access is to ./file .
Trying to reject access using attributes of /tmp/dir1/dir2/ or /tmp/dir1/ does
not work.
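The outcome of steps (7) to (20) above can be mimicked with a small userspace
model of a path walk. This is only a sketch under assumptions: struct node,
walk() and the inode numbers are hypothetical, and the model simply logs which
directories get a traverse check. It is not how the kernel implements lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_CHILDREN 4
#define MAX_VISITED 16

/* Hypothetical directory-tree node; ino values are made up. */
struct node {
    const char *name;
    unsigned long ino;
    struct node *children[MAX_CHILDREN];
    struct node *bind_target;  /* non-NULL if something is bind mounted here */
};

struct walk_log {
    unsigned long visited[MAX_VISITED];
    size_t count;
};

static struct node *find_child(struct node *dir, const char *name, size_t len)
{
    for (size_t i = 0; i < MAX_CHILDREN && dir->children[i]; i++) {
        const char *n = dir->children[i]->name;
        if (strlen(n) == len && strncmp(n, name, len) == 0)
            return dir->children[i];
    }
    return NULL;
}

/* Resolve path from start (the root for absolute paths, the current
 * directory for relative ones), logging the inode of every directory that
 * receives a traverse check on the way. */
static struct node *walk(struct node *start, const char *path,
                         struct walk_log *log)
{
    struct node *cur = start;
    log->count = 0;
    while (*path) {
        while (*path == '/')
            path++;
        if (!*path)
            break;
        size_t len = strcspn(path, "/");
        assert(log->count < MAX_VISITED);
        log->visited[log->count++] = cur->ino;  /* traverse check on cur */
        cur = find_child(cur, path, len);
        if (!cur)
            return NULL;
        if (cur->bind_target)
            cur = cur->bind_target;             /* cross the bind mount */
        path += len;
    }
    return cur;
}

/* Would a rule keyed on this directory inode have been consulted? */
static bool was_traversed(const struct walk_log *log, unsigned long ino)
{
    for (size_t i = 0; i < log->count; i++)
        if (log->visited[i] == ino)
            return true;
    return false;
}
```

In this model, resolving /tmp/dir1/dir2/dir3/file from the root logs the inode
of dir2, but resolving /tmp/dir0/file (with dir3 bind mounted on dir0) or
resolving file from inside dir3 never does, so a rule keyed on dir2's
min/maj/inode is never consulted for those two requests.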
You might think we can still block access if we forbid changing the current
directory to /tmp/dir1/dir2/dir3/ and forbid bind mount operations. Yes, if
your system can work even when you unconditionally forbid them. But we can't
forbid them selectively; security_inode_permission() cannot tell whether the
check is for a chdir operation or for other operations. If you unconditionally
deny traverse operation at security_inode_permission(), yourbackup becomes
unreachable for everyone (i.e. even for applications which should be able to
access yourbackup).
Post by do1
Post by Tetsuo Handa
People can access protected data using relative pathnames. This means that,
a directory with "path.ino=1234567 path.major=8 path.minor=2" may not be
traversed when accessing a file which is located as a descendant of the
directory.
People will not be able to chdir behind that directory, so it works.
Post by Tetsuo Handa
Also, /home/ or /home/backup/year/month/day/ might be bind mounted to
somewhere else.
If you use pathnames in your rules, please understand and accept that
the rules are not compatible with read-only mounts/directories.
I think they are compatible, or at least a step in that direction. We don't
need to achieve perfect security at once, but can go step by step and see if
this helps.
Best regards,
I hope you have now understood why your min/maj/inode idea does not work.