Discussion:
Thinking outside the box on file systems
Marc Perkel
2007-08-14 22:45:05 UTC
I want to throw out some concepts about a new way of
thinking about file systems. But the first thing you
have to do is forget what you know about file
systems now. This is a discussion about a radically
different view of file storage, and it's more easily
understood if you first set aside a lot of what you
know. The idea is to create what seems natural to
the user rather than what seems natural to the
programmer.

For example, if a user has neither read nor write
access to a file, why should they be able to delete
the file - or even see it listed in the directory?
To grasp this, the idea of directory permissions as
you now know them needs to go away.

Imagine that the file system is a database that
contains file data, name data, and permission data.
Lose the idea that files have an owner and a group or
the attributes that we are familiar with. Think
instead that users, groups, managers, applications,
and such are objects, and that there is a rich rights
system that gives them access to names that point to
file data.

For example, if you list a directory you only see the
files that you have some rights to; files where you
have no rights are invisible to you. If a file is read
only to you then you can't delete it either. Having
write access to a directory really means that you have
file create rights. You can also delete files that you
have write access to. You would also allocate
permissions to manage file rights, like being able to
set the rights of inferior users.

The ACLs that were added to Linux were a step in the
right direction, but very incomplete. What is needed
is a richer permission system that would allow fine
grained permissions and inheritance masks to control
what permissions are granted when someone moves new
files into a directory. Instead of just root and users
there would be mid-level roles, where users and
objects have management authority over parts of the
system and the roles can be defined in a very flexible
way. For example, rights might change during "business
hours".

I want to throw these concepts out there to inspire a
new way of thinking and let Linux evolve into a more
natural kind of file system rather than staying true
to its ancient roots. Of course there would be an
emulation layer to keep existing apps happy, but I
think that Linux will never be truly what it could be
unless it breaks away from the limitations of the
past.

Anyhow, I'm going to stop at this just to let these
ideas settle in. In my mind there's a lot more detail
but let's see where this goes.

Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


alan
2007-08-14 22:51:14 UTC
Post by Marc Perkel
For example. If you list a directory you only see the
files that you have some rights to and files where you
have no rights are invisible to you. If a file is read
only to you then you can't delete it either. Having
write access to a directory really means that you have
file create rights. You can also delete files that you
have write access to. You would also allocate
permissions to manage file rights like being able to
set the rights of inferior users.
Imagine the fun you will have trying to write a file name and being told
you cannot write it for some unknown reason. Unbeknownst to you, there is
a file there, but it is not owned by you, thus invisible.

A file system that is truly more user oriented would have to avoid
little gotchas like this. The reason it is "programmer oriented" is
that those are the people who have worked out why it works the way it
does and why certain things are bad ideas.
--
Refrigerator Rule #1: If you don't remember when you bought it, Don't eat it.
Michael Tharp
2007-08-15 13:02:37 UTC
Post by alan
Imagine the fun you will have trying to write a file name and being told
you cannot write it for some unknown reason. Unbeknownst to you, there
is a file there, but it is not owned by you, thus invisible.
This jumped out at me right away. In such a system, an attacker with
write permissions on a "sticky" directory like /tmp could probe for
others' files by attempting to create them and recording all cases where
permission was denied due to an existing, hidden file. But of course,
this was just an example of something a less UNIX-y permission scheme
could do, not a key part of such a design.

Personally, what I'd like to see is a better way of dealing with
propagation of ownership. Currently, in order to allow "collaboration"
directories where a directory tree is owned by a certain group and
anyone in that group can write and create files, one has to change the
system umask, use a magical bit (the setgid bit) on the collaboration
directory to
propagate group ownership, and create a group for every user on the
system in order to keep their personal files safe with the new umask.
This seems highly flawed. I suggest that propagation of group ownership
should be the default mode, not a special one, and that the
group-writable permissions should also be propagated to new files and
directories. This way, the user's home directory would remain 0755,
while the collaboration directory could be 0775, without any changing of
umasks.
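
(For reference, the traditional recipe is roughly "chgrp devs
/srv/collab && chmod 2775 /srv/collab", where the leading 2 is the
setgid bit that propagates group ownership, plus a umask of 002 so
new files come out group-writable; "devs" and "/srv/collab" are of
course just placeholder names.)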

Of course, this would go against tradition, and cause some mayhem in the
logic responsible for magically determining permissions for new files,
but since we're talking about thinking outside of the box, I think
that's excusable :)

-- m. tharp
Lennart Sorensen
2007-08-15 13:30:21 UTC
Post by Michael Tharp
This jumped out at me right away. In such a system, an attacker with
write permissions on a "sticky" directory like /tmp could probe for
others' files by attempting to create them and recording all cases where
permission was denied due to an existing, hidden file. But of course,
this was just an example of something a less UNIX-y permission scheme
could do, not a key part of such a design.
Personally, what I'd like to see is a better way of dealing with
propagation of ownership. Currently, in order to allow "collaboration"
directories where a directory tree is owned by a certain group and
anyone in that group can write and create files, one has to change the
system umask, use a magical bit on the collaboration directory to
propagate group ownership, and create a group for every user on the
system in order to keep their personal files safe with the new umask.
This seems highly flawed. I suggest that propagation of group ownership
should be the default mode, not a special one, and that the
group-writable permissions should also be propagated to new files and
directories. This way, the user's home directory would remain 0755,
while the collaboration directory could be 0775, without any changing of
umasks.
Of course, this would go against tradition, and cause some mayhem in the
logic responsible for magically determining permissions for new files,
but since we're talking about thinking outside of the box, I think
that's excusable :)
Posix ACLs seem to solve most group permission issues and control of
permission propagation. They actually work quite well on Linux. I
would be surprised if there aren't lots of people already using them.

Remember that existing software expects a unix-style interface, since
those are unix programs. Anything you invent MUST be compatible with
the standard unix view of things, even if it offers additional options.

--
Len Sorensen
Kyle Moffett
2007-08-15 13:53:45 UTC
Post by Lennart Sorensen
Post by Michael Tharp
Personally, what I'd like to see is a better way of dealing with
propagation of ownership. Currently, in order to allow
"collaboration" directories where a directory tree is owned by a
certain group and anyone in that group can write and create files,
one has to change the system umask, use a magical bit on the
collaboration directory to propagate group ownership, and create a
group for every user on the system in order to keep their personal
files safe with the new umask. This seems highly flawed. I suggest
that propagation of group ownership should be the default mode,
not a special one, and that the group-writable permissions should
also be propagated to new files and directories. This way, the
user's home directory would remain 0755, while the collaboration
directory could be 0775, without any changing of umasks.
Posix ACLs seem to solve most group permissions issues and control
of permission propegation. It actually works quite well on Linux.
I am surprised if there aren't lots of people already using it.
Going even further in this direction, the following POSIX ACL on the
directories will do what you want:

## Note: file owner and group are kmoffett
u::rw-
g::rw-
u:lsorens:rw-
u:mtharp:rw-
u:mperkel:rw-
g:randomcvsdudes:r--
default:u::rw-
default:g::rw-
default:u:lsorens:rw-
default:u:mtharp:rw-
default:u:mperkel:rw-
default:g:randomcvsdudes:r--

Basically any newly-created item in such a directory will get the
permissions described by the "default:" entries in the ACL, and
subdirectories will get a copy of said "default:" entries.
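
(For anyone wanting to try it: such entries are managed with the
standard ACL tools, e.g. "setfacl -m u:mtharp:rw- somedir" for an
access entry and "setfacl -d -m u:mtharp:rw- somedir" for the
inherited default entry, with "getfacl somedir" showing the result;
"somedir" is just a placeholder.)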

So yes, such functionality is nice; even more so because we already
have it. I think if you were really going to "extend" a UNIX
filesystem it would need to be in a few directions:
(A) Handling disk failures by keeping multiple copies of
important files.
(B) Have version-control support
(C) Allowing distributed storage (also lazy synchronization and
offline modification support)

With some appropriate modifications and hooks, GIT actually comes
pretty close here. For larger files it needs to use a
"list-of-4MB-chunks" approach to minimize the computation overhead
for committing
a randomly-modified file. The "index" of course would be directly
read and modified by vfs calls and via mapped memory. Merge handling
would need careful integration, preferably allowing custom
default-merge-handlers per subtree. There would be lots more design
issues to work out, but it's something to think about.

Cheers,
Kyle Moffett
Michael Tharp
2007-08-15 15:14:33 UTC
Post by Kyle Moffett
Basically any newly-created item in such a directory will get the
permissions described by the "default:" entries in the ACL, and
subdirectories will get a copy of said "default:" entries.
This would work well, although I would give write permissions to a group
so the entire dir wouldn't need to be re-ACLed when a user is added. I
may give this a shot; I've been avoiding ACLs because they have always
sounded incomplete/not useful, but the inheritance aspect sounds rather
nice.
Post by Kyle Moffett
So yes, such functionality is nice; even more so because we already have
it. I think if you were really going to "extend" a UNIX filesystem it
would need to be in a few directions:
(A) Handling disk failures by keeping multiple copies of important
files.
This is ZFS' bailiwick, no? I'd love to see the licensing issues
resolved, because if it can control level of redundancy on a
per-file/directory basis, I would be a very happy man.
Post by Kyle Moffett
(B) Have version-control support
This might be pushing it, but hey, we *are* talking about the future here.
Post by Kyle Moffett
(C) Allowing distributed storage (also lazy synchronization and
offline modification support)
I'd really love to see distributed storage not suck. Everything I've
seen requires myriad daemons and ugly configuration.
Post by Kyle Moffett
With some appropriate modifications and hooks, GIT actually comes pretty
close here. For larger files it needs to use a "list-of-4MB-chunks"
approach to minimize the computation overhead for committing a
randomly-modified file. The "index" of course would be directly read
and modified by vfs calls and via mapped memory. Merge handling would
need careful integration, preferably with allowing custom
default-merge-handlers per subtree. There would be lots more design
issues to work out, but it's something to think about
Now you're just being silly ;)
Post by Kyle Moffett
Cheers,
Kyle Moffett
-- m. tharp
Marc Perkel
2007-08-15 16:36:38 UTC
Post by Michael Tharp
Post by Kyle Moffett
Basically any newly-created item in such a directory will get the
permissions described by the "default:" entries in the ACL, and
subdirectories will get a copy of said "default:" entries.
This would work well, although I would give write permissions to a
group so the entire dir wouldn't need to be re-ACLed when a user is
added. I may give this a shot; I've been avoiding ACLs because they
have always sounded incomplete/not useful, but the inheritance aspect
sounds rather nice.
Michael, my idea in this model is that there would be
no permissions stored on files, so you would not have
to re-ACL anything.

What I'm thinking is there would be a new permission
system that is completely different.

It might be something like this. I am logged in as
mperkel. I get all the rights of mperkel and all other
objects like groups or management lists that I am a
member of. Once the system has a full list of my
rights it starts to compare the file name I'm trying
to access to the rights I have by testing each section
of the name. So if the file is
/home/mperkel/files/myfile then the test would be:

/home/mperkel/files/myfile - nothing
/home/mperkel/files - nothing
/home/mperkel - match - mperkel granted tree permission

Rights tests would be based on trees, so if you hit a
tree permission then you can access anything in the
tree unless you have hit a deny in the branches. All
of this is based on the text strings in the file name,
with the "/" separator for the tests.
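
A minimal sketch of that walk as I read it (illustrative
Python only; the rule table and its semantics are invented):

import posixpath

TREE_RULES = {"/home/mperkel": "tree permission granted"}

def lookup(path):
    while path not in ("/", ""):
        if path in TREE_RULES:
            return TREE_RULES[path]
        path = posixpath.dirname(path)  # strip one "/" section
    return "nothing"

print(lookup("/home/mperkel/files/myfile"))
# -> tree permission granted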

The correct way of thinking of this is applying
permissions to name strings. Directories will become
artificial constructs. For example, one might grant
permissions for files:

/etc/*.conf - read only
/etc - deny

In this example the user would be able to read any
file in the /etc directory that ended in *.conf but no
other files. If the object listed the /etc directory
it would only show the *.conf files and no other file
would appear to exist.

The important point here is that directories don't
really exist. Imagine that every file has an internal
number that is linked to the blocks that contain that
file. Then there are file names that link to that
number directly. Then there is a permission system
that compares the name you are requesting to a
permission algorithm that determines what you are
allowed to do to the name that you are requesting.

For example, say you want to list all file names /etc/*.
Each name is fetched and your permissions are compared
to each item and you get a list of names that you have
some permission to access. If you have no permission
to a name that exists then you don't see the name.
Thus, suppose you have this permission:

/etc/pass* - deny

Then you will not only be denied access to the
/etc/passwd file, you wouldn't even be able to tell if
it exists.
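
To make that concrete, a toy sketch of the name-pattern
model (my reading of it; the rules, their ordering, and
the right names are all invented for illustration):

import fnmatch

RULES = [                       # checked in order, first match wins
    ("/etc/pass*", "deny"),
    ("/etc/*.conf", "read"),
    ("/etc/*", "deny"),
]

def rights_for(name):
    for pattern, right in RULES:
        if fnmatch.fnmatchcase(name, pattern):
            return right
    return "deny"               # no grant at all: the name is invisible

def list_visible(names):
    return [n for n in names if rights_for(n) != "deny"]

print(rights_for("/etc/passwd"))  # -> deny
print(list_visible(["/etc/passwd", "/etc/hosts", "/etc/ntp.conf"]))
# -> ['/etc/ntp.conf']; the other names don't appear to exist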

The root user for compatibility would have permissions
to everything. It would be like a super manager.
Managers would be objects that have limited ability to
alter the permissions for other users or objects that
they manage.

I'm also thinking there would be a "kernel" user which
would be a level above the root user where the kernel
would have access to files that even the root user
can't see (unless debug modes are set), so that some
files can be system only, or readable by root but
writable by the kernel.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


Kyle Moffett
2007-08-15 17:17:39 UTC
One note before you read the rest of this:
The kinds of things you are discussing here are usually called
"MAC" or "Mandatory Access Control", and they are always implemented
on top of an LSM *after* ordinary "DAC" or "Discretionary Access
Control" (IE: file permissions) are applied. If your MAC rules are
good enough you could theoretically just chmod 6777 all the files on
the system and make them all root-owned, but why throw away
mostly-free security?
Post by Marc Perkel
Post by Michael Tharp
Post by Kyle Moffett
Basically any newly-created item in such a directory will get the
permissions described by the "default:" entries in the ACL, and
subdirectories will get a copy of said "default:" entries.
This would work well, although I would give write permissions to a
group so the entire dir wouldn't need to be re-ACLed when a user is
added. I may give this a shot; I've been avoiding ACLs because they
have always sounded incomplete/not useful, but the inheritance aspect
sounds rather nice.
Michael, my idea in this model is that there would be no permissions
stored on files, so you would not have to re-ACL anything.
What I'm thinking is there would be a new permission system that is
completely different.
It might be something like this. I am logged in as mperkel. I get
all the rights of mperkel and all other objects like groups or
management lists that I am a member of. Once the system has a full
list of my rights it starts to compare the file name I'm trying to
access to the rights I have by testing each section of the name.
Big flashing "WARNING" sign number 1: "Once the system has a _full_
_list_ of..." A "full list of" anything does not scale at all. When
coming up with kernel ideas, think of the biggest possible box you
can imagine, then scale it up by 2-3 orders of magnitude. If it
still works then you're fine; otherwise....
/home/mperkel/files/myfile - nothing
/home/mperkel/files - nothing
/home/mperkel - match - mperkel granted tree permission
Rights tests would be based on trees, so if you hit a tree
permission then you can access anything in the tree unless you have
hit a deny in the branches. All of this is based on the text
strings in the file name, with the "/" separator for the tests.
Big flashing "WARNING" sign number 2: You are doing privileges based
on pathnames, which is a massive no-no. Please go see the huge
AppArmor/SELinux flame-war that occurred a month or so ago for all
the reasons why pathname-based security doesn't work (and
furthermore, doesn't scale). Hint: It has to do with 4 syscalls:
chroot(), mount(), clone(CLONE_NEWNS), and link()
The correct way of thinking of this is applying permissions to name
strings. Directories will become artificial constructs. For example:
/etc/*.conf - read only
/etc - deny
And so when both /etc/shadow and
/tmp/file_about_to_be_nuked_by_a_daemon point to the same block of
data, you now lose *ALL* of the benefits of that model.
In this example the user would be able to read any file in the /etc
directory that ended in *.conf but no other files. If the object
listed the /etc directory it would only show the *.conf files and no
other file would appear to exist.
Big flashing "WARNING" sign number 3: This means putting some kind
of pattern matcher in the kernel. I don't even want to *THINK* about
how unholy of a hell that would be. The way SELinux does this is by
using regexes in userspace to set the appropriate *initial* labels on
all the files, and let the system's automatic type transitions take
care of the rest.
The important point here is that directories don't really exist.
Except they do, and without directories the performance of your
average filesystem is going to suck.
Imagine that every file has an internal number that is linked to
the blocks that contain that file. Then there are file names that
link to that number directly.
These are called "inodes" and "hardlinks". Every file and directory
is a hardlink to the inode (using the inode number) containing its
data. For directories the inode data blocks have more inode
references, and for files they have actual data blocks. This is the
model that UNIX operating systems have been using for years.
Then there is a permission system that compares the name you are
requesting to a permission algorithm that determines what you are
allowed to do to the name that you are requesting.
The "name" that you are requesting is a fundamentally bad security
concept. Better is the attributes of the actual *inode* that you are
requesting.
Then you will not only be denied access to the /etc/passwd file,
you wouldn't even be able to tell if it exists.
You could theoretically do this with SELinux now; there was even a
thread recently about somebody trying to add an LSM hook for
readdir(), so that he could hide entries from an "ls". On the other
hand,
under SELinux right now such a file looks like this:

***@ares:~# ls -al /foo
dr-xr-xr-x 3 root root 4096 2007-08-15 13:14 .
dr-xr-xr-x 26 root root 4096 2007-08-15 13:14 ..
?--------- - - - - - file_with_no_selinux_perms

I can still tell that "file_with_no_selinux_perms" is actually a
directory, though, by looking at the hardlink count of /foo. Since
it's 3, I can count up the parent-dir's-link and our own link,
leaving one left which must be a child-dir's link.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 17:30:19 UTC
--- Kyle Moffett <***@mac.com> wrote:
Post by Kyle Moffett
Post by Marc Perkel
The important point here is that directories don't really exist.
Except they do, and without directories the performance of your
average filesystem is going to suck.
Actually you would get a speed improvement. You hash
the full name and get the file number. You don't have
to break up the name into sections except for
evaluating name permissions.

The important concept here is that files and names
aren't stored by levels of directories. The name
points to the file number. Directory levels are
emulated based on name separation characters or any
other algorithm that you want to use.
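
A toy in-memory version of that lookup (Python; the names
and numbers are invented for illustration):

NAMES = {
    "/home/mperkel/notes": 101,
    "/home/mperkel/files/myfile": 102,
    "/etc/ntp.conf": 103,
}

def file_number(path):
    return NAMES.get(path)      # one hash lookup, no directory walk

def list_dir(prefix):
    # a "directory" is emulated by scanning for a name prefix
    prefix = prefix.rstrip("/") + "/"
    return sorted(n for n in NAMES if n.startswith(prefix))

print(file_number("/home/mperkel/files/myfile"))  # -> 102
print(list_dir("/home/mperkel"))

Note that even in the toy, listing a "directory" means
scanning every name for a prefix.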

One could create a file system and permission system
that gets rid of the concept of directories entirely
if one chooses to.

That's outside the box big time.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


Craig Ruff
2007-08-15 18:22:52 UTC
Post by Marc Perkel
Post by Kyle Moffett
Except they do, and without directories the
performance of your average filesystem is going to suck.
Actually you would get a speed improvement. You hash
the full name and get the file number. You don't have
to break up the name into sections except for
evaluating name permissions.
The important concept here is that files and names
aren't stored by levels of directories. The name
points to the file number. Directory levels are
emulated based on name separation characters or any
other algorithm that you want to use.
One could create a file system and permission system
that gets rid of the concept of directories entirely
if one chooses to.
I would like to add support for Kyle's assertion.

The model described by Marc is exactly the method used by the current
version of the NCAR Mass Storage Service (MSS), which is a data archive
of 4+ petabytes contained in 40+ million files. From the user's point
of view, it looks somewhat like a POSIX file system with both some
extensions and deficiencies. The MSS was designed in the mid-1980s,
in an era where the costs of the supercomputers (Cray-1s at that time)
were paramount. This led to some MSS design decisions meant to minimize
the need for users to rerun jobs on the expensive supercomputer just
because they messed up their MSS file creation statements.

File names are a maximum of 128 bytes, with a dynamically managed
directory structure indicated by '/' characters in the name. The file
name is hashed, and the hash table provides the internal file number (the
address in the Master File Directory (MFD)). Any parent directories
are created automatically by the system upon file creation, and are
automatically deleted if empty upon file deletion. Directories also
have a self pointer, and both files and directories are chained together
to allow the user to list (or otherwise manipulate) the contents of
a directory.

The biggest problem with this model is that to manipulate a directory
itself, you have to simulate the operation on all of the files contained
within it. For example, to rename a directory with 'n' descendants,
you must perform:

n+1 hash table removals
n+1 hash table insertions (with collision detection)
n+1 MFD record updates
1 directory chain removal
1 directory chain insertion

This is, needless to say, very painful when n is large. Since users
must use directory trees to efficiently manage their data holdings,
efficient directory manipulation is essential. Contrast this with
the number of operations required for a directory rename if files
do not record their complete pathname:

1 directory chain removal
1 directory chain insertion
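
To make the cost concrete, here is the same rename in a toy
full-path-hash table (Python, invented for illustration; the real
MSS is of course far more involved). Every descendant key must be
rehashed:

def rename_dir(names, old, new):
    for path in [p for p in names if p == old or p.startswith(old + "/")]:
        names[new + path[len(old):]] = names.pop(path)

names = {"/a/x": 1, "/a/b/y": 2, "/c": 3}
rename_dir(names, "/a", "/z")
print(names)  # {'/c': 3, '/z/x': 1, '/z/b/y': 2}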

Fortunately we are currently working to change from using a model like
Marc describes to one Kyle describes.
Marc Perkel
2007-08-15 20:35:08 UTC
Post by Craig Ruff
Post by Marc Perkel
Post by Kyle Moffett
Except they do, and without directories the
performance of your average filesystem is going to suck.
Actually you would get a speed improvement. You hash
the full name and get the file number. You don't have
to break up the name into sections except for
evaluating name permissions.
The important concept here is that files and names
aren't stored by levels of directories. The name
points to the file number. Directory levels are
emulated based on name separation characters or any
other algorithm that you want to use.
One could create a file system and permission system
that gets rid of the concept of directories entirely
if one chooses to.
I would like to add support for Kyle's assertion.
The model described by Marc is exactly the method used by the current
version of the NCAR Mass Storage Service (MSS), which is a data archive
of 4+ petabytes contained in 40+ million files. From the user's point
of view, it looks somewhat like a POSIX file system with both some
extensions and deficiencies. The MSS was designed in the mid-1980s,
in an era where the costs of the supercomputers (Cray-1s at that time)
were paramount. This led to some MSS design decisions meant to minimize
the need for users to rerun jobs on the expensive supercomputer just
because they messed up their MSS file creation statements.
File names are a maximum of 128 bytes, with a dynamically managed
directory structure indicated by '/' characters in the name. The file
name is hashed, and the hash table provides the internal file number
(the address in the Master File Directory (MFD)). Any parent
directories are created automatically by the system upon file creation,
and are automatically deleted if empty upon file deletion. Directories
also have a self pointer, and both files and directories are chained
together to allow the user to list (or otherwise manipulate) the
contents of a directory.
The biggest problem with this model is that to manipulate a directory
itself, you have to simulate the operation on all of the files
contained within it. For example, to rename a directory with 'n'
descendants, you must perform:
n+1 hash table removals
n+1 hash table insertions (with collision detection)
n+1 MFD record updates
1 directory chain removal
1 directory chain insertion
This is, needless to say, very painful when n is large. Since users
must use directory trees to efficiently manage their data holdings,
efficient directory manipulation is essential. Contrast this with
the number of operations required for a directory rename if files
do not record their complete pathname:
1 directory chain removal
1 directory chain insertion
Fortunately we are currently working to change from using a model like
Marc describes to one Kyle describes.
I am describing a kind of functionality and am not
tied to the method that implements that functionality.
Perhaps a straight hash of the name isn't the best way
to implement it. Just because someone tried to do
something like what I'm suggesting years ago and it
didn't work doesn't mean that it can't be done. You
just have to come up with a better method.

Let's take this example. We are moving a million files
from one branch of a tree to another. Do we wait for a
million renames and rehashes to occur? Of course not. So
what do we do? We continue to be innovative.

One must first adopt the attitude that anything can be
done - you just have to be persistent until you figure
out how.

In this case we could have a name translation layer, so
if you want to do a move you change the translation
layer to indicate that a move occurred. Access to
the files under their new names then gets translated
into the old names until the files are rehashed.

Or - maybe there is some sort of tokenizer database
for the names in the directory sections and you can
just rename the section. Sort of a tree-like database
of hashes within hashes.

My point - you start with what you want to do and then
you figure out how to make it happen. I can't answer
all the details of how to make it happen but when I do
something I start with the idea that if this were done
right it would work this way and then I figure out
how.




Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Helge Hafting
2007-08-16 11:27:05 UTC
Post by Marc Perkel
I am describing a kind of functionality and not tied
to the method that implements that functionality.
Perhaps a straight hash of the name isn't the best way
to implement it. Just because someone tried to do
something like what I'm suggesting years ago and it
didn't work doesn't mean that it can't be done. You
just have to come up with a better method.
Actually, you are the one who has to
come up with a better method. :-)
You implement it, you deploy it on your server,
you work out the quirks,
you win the discussion and impress everybody.
Post by Marc Perkel
Lets take this example. We are moving a million files
from one branch if a tree to another. Do we wait for a
million renames and hashes to occur? Of course not. So
what to we do? We continue to be innovative.
One must first adopt the attitude that anything can be
done - you just have to be persistent until you figure
out how.
Feel free to be persistent and do the work & testing
needed. People have done so before, which is why
there are so many linux filesystems to choose from.

What you can't do though, is to start with an attitude
that "anything can be done" and then expect to change
the direction of kernel work before everything is
figured out. Experimental stuff stays in your private
tree till it is proven - for a reason.
Post by Marc Perkel
In this case we could have a name translation layer so
if you want to do a move you change the translation
layer indicating that a move occurred. Thus access to
the new files get translated into the old name and
accessed until the files are rehashed.
That name translation layer has a nasty failure mode.
Let's say I want to move all files in /usr/local/* into /usr/old-local/.
The name translation layer makes the transition seem
instant, even though there are now 10 million files queued
for renaming. References to /usr/local now go to
/usr/old-local/ thanks to translation.

Now the problem. I did the move in order to fill
/usr/local/ with new improved files. But all references
to /usr/local now go to /usr/old-local/, so the new
files go there as well.
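
A toy prefix-translation layer shows the failure mode (Python,
semantics invented for illustration):

TRANSLATE = {"/usr/local": "/usr/old-local"}  # move recorded, not yet done

def resolve(path):
    for src, dst in TRANSLATE.items():
        if path == src or path.startswith(src + "/"):
            return dst + path[len(src):]
    return path

print(resolve("/usr/local/bin/tool"))  # -> /usr/old-local/bin/tool, intended
print(resolve("/usr/local/newfile"))   # -> /usr/old-local/newfile, oops:
                                       #    the brand-new file is redirected too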

A translation layer _can_ work around this, by noting
that /usr/local is currently being translated, the
move queue has not finished processing, so further accesses
to /usr/local go to /usr/local-tmp/ instead, to be moved
into /usr/local when that other batch of moves finishes.

In doing this, you set yourself up for lots of complexity
and lots of behind-the-scenes work. People will
wonder why the server is so busy when nobody is
doing much, and what if there is a power failure
or disk error on a busy server? It'd be an "interesting"
fsck job on a machine that went down while
processing several such deferred moves and translations. :-/
Post by Marc Perkel
My point - you start with what you want to do and then
you figure out how to make it happen. I can't answer
all the details of how to make it happen but when I do
something I start with the idea that if this were done
right it would work this way and then I figure out
how.
Sure - you don't have to prove that it will work now.
Working it out over time and then showing us is ok.

Helge Hafting
Marc Perkel
2007-08-15 16:02:41 UTC
Post by Kyle Moffett
## Note: file owner and group are kmoffett
u::rw-
g::rw-
u:lsorens:rw-
u:mtharp:rw-
u:mperkel:rw-
g:randomcvsdudes:r--
default:u::rw-
default:g::rw-
default:u:lsorens:rw-
default:u:mtharp:rw-
default:u:mperkel:rw-
default:g:randomcvsdudes:r--
Kyle, thinking further outside the box, files would no
longer have owners or permissions. Nor would
directories. People, groups, managers, and other
objects will have permissions. One might tag a file
with the object that created it so you could implement
"self" rights which might be use to replace the
concept of /tmp directories.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



V***@vt.edu
2007-08-15 16:57:45 UTC
Post by Marc Perkel
Kyle, thinking further outside the box, files would no
longer have owners or permissions. Nor would
directories. People, groups, managers, and other
objects will have permissions.
You gotta think *way* out of the box to come up with a system where a "file"
isn't an object that can have some sort of ACL or permissions on it.
Marc Perkel
2007-08-15 17:09:31 UTC
Post by V***@vt.edu
Post by Marc Perkel
Kyle, thinking further outside the box, files would no
longer have owners or permissions. Nor would
directories. People, groups, managers, and other
objects will have permissions.
You gotta think *way* out of the box to come up with a system where a
"file" isn't an object that can have some sort of ACL or permissions
on it.
Yep - way outside the box - and thus the title of the
thread.

The idea is that people have permissions - not files.
By people I mean users, groups, managers, applications
etc. One might even specify that there are no
permission restrictions at all. Part of the process
would be that the kernel loads whatever code it will use
for the permission system. It might even be a little
perl script you write.


Also - you aren't even giving permission to access
files. It's permission to access name patterns. One
could apply REGEX masks to names to determine
permissions. So if you have permission to the name you
have permission to the file.

Hard links would be multiple names pointing to the
same file. Symlinks would be name aliases.



Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Kyle Moffett
2007-08-15 17:22:43 UTC
Post by Marc Perkel
The idea is that people have permissions - not files. By people I
mean users, groups, managers, applications, etc. One might even
specify that there are no permission restrictions at all. Part of
the process would be that the kernel loads whatever code it will use
for the permission system. It might even be a little perl script you
write.
Also - you aren't even giving permission to access files. It's
permission to access name patterns. One could apply REGEX masks to
names to determine permissions. So if you have permission to the
name you have permission to the file.
Please excuse me, I'm going to go stand over in the corner for a minute.

*hahahahahaa hahahahahaaa hahaa hoo hee snicker sniff*

*wanders back into the conversation*

Sorry about that, pardon me.

I suspect you will find it somewhat hard to convince *anybody* on
this list to put either a regex engine or a Perl interpreter into the
kernel. I doubt you could even get a simple shell-style pattern
matcher in. First of all, both of the former chew up enormous gobs
of stack space *AND* they're NP-complete. You just can't do such
matching even in polynomial time, let alone something that scales
appropriately for an OS kernel like, say, O(log(n)).

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 17:34:11 UTC
Post by Kyle Moffett
Post by Marc Perkel
The idea is that people have permissions - not files. By people I
mean users, groups, managers, applications, etc. One might even
specify that there are no permission restrictions at all. Part of
the process would be that the kernel loads whatever code it will use
for the permission system. It might even be a little perl script
you write.
Also - you aren't even giving permission to access files. It's
permission to access name patterns. One could apply REGEX masks to
names to determine permissions. So if you have permission to the
name you have permission to the file.
Please excuse me, I'm going to go stand over in the corner for a minute.
*hahahahahaa hahahahahaaa hahaa hoo hee snicker sniff*
*wanders back into the conversation*
Sorry about that, pardon me.
I suspect you will find it somewhat hard to convince *anybody* on
this list to put either a regex engine or a Perl interpreter into the
kernel. I doubt you could even get a simple shell-style pattern
matcher in. First of all, both of the former chew up enormous gobs
of stack space *AND* they're NP-complete. You just can't do such
matching even in polynomial time, let alone something that scales
appropriately for an OS kernel like, say, O(log(n)).
Cheers,
Kyle Moffett
Keep in mind that this is about thinking outside the
box. Don't let new ideas scare you.

I'm not suggesting that the kernel contain perl. I'm
saying that you can let the kernel call a perl program
in user space to control part of the permission
system. There are examples of this in FUSE. What I'm
suggesting would be very FUSE friendly.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Alan
2007-08-18 23:27:23 UTC
Post by Marc Perkel
Keep in mind that this is about thinking outside the
box. Don't let new ideas scare you.
My cat thinks outside the box all the time. Cleaning it up is a real
pain.
Alan
2007-08-18 23:26:21 UTC
Post by Kyle Moffett
Post by Marc Perkel
The idea is that people have permissions - not files. By people I
mean users, groups, managers, applications, etc. One might even
specify that there are no permission restrictions at all. Part of
the process would be that the kernel loads whatever code it will use
for the permission system. It might even be a little perl script
you write.
Also - you aren't even giving permission to access files. It's
permission to access name patterns. One could apply REGEX masks to
names to determine permissions. So if you have permission to the
name you have permission to the file.
Please excuse me, I'm going to go stand over in the corner for a minute.
*hahahahahaa hahahahahaaa hahaa hoo hee snicker sniff*
*wanders back into the conversation*
Sorry about that, pardon me.
I suspect you will find it somewhat hard to convince *anybody* on
this list to put either a regex engine or a Perl interpreter into the
kernel. I doubt you could even get a simple shell-style pattern
matcher in. First of all, both of the former chew up enormous gobs
of stack space *AND* they're NP-complete. You just can't do such
matching even in polynomial time, let alone something that scales
appropriately for an OS kernel like, say, O(log(n)).
Already been done. Take a look at "AppArmor" aka "Immunix".
d***@lang.hm
2007-08-19 02:03:06 UTC
Post by Alan
Post by Kyle Moffett
Post by Marc Perkel
The idea is that people have permissions - not files. By people I
mean users, groups, managers, applications, etc. One might even
specify that there are no permission restrictions at all. Part of
the process would be that the kernel loads whatever code it will use
for the permission system. It might even be a little perl script
you write.
Also - you aren't even giving permission to access files. It's
permission to access name patterns. One could apply REGEX masks to
names to determine permissions. So if you have permission to the
name you have permission to the file.
Please excuse me, I'm going to go stand over in the corner for a minute.
*hahahahahaa hahahahahaaa hahaa hoo hee snicker sniff*
*wanders back into the conversation*
Sorry about that, pardon me.
I suspect you will find it somewhat hard to convince *anybody* on
this list to put either a regex engine or a Perl interpreter into the
kernel. I doubt you could even get a simple shell-style pattern
matcher in. First of all, both of the former chew up enormous gobs
of stack space *AND* they're NP-complete. You just can't do such
matching even in polynomial time, let alone something that scales
appropriately for an OS kernel like, say, O(log(n)).
Already been done. Take a look at "AppArmor" aka "Immunix".
don't forget the ACPI interpreter.

David Lang
Al Viro
2007-08-19 02:57:58 UTC
Post by d***@lang.hm
Post by Alan
Post by Kyle Moffett
I suspect you will find it somewhat hard to convince *anybody* on
this list to put either a regex engine or a Perl interpreter into the
kernel. I doubt you could even get a simple shell-style pattern
matcher in. First of all, both of the former chew up enormous gobs
of stack space *AND* they're NP-complete.
Eh? regex via NFA is O(expression size * string length) time and
O(expression size) space. If you can show that regex matching is
NP-complete, you've got a good shot at the Nevanlinna Prize...

Not that it made regex in kernel a good idea, but fair is fair -
unless you can show any mentioning of backrefs upthread...[1]
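
For the curious, a toy state-set ("Thompson-style") matcher in the
spirit of what Al describes, supporting only literals, '.' and a
postfix '*' (illustrative Python, obviously not kernel material).
It runs in O(pattern length * text length) with no backtracking:

def parse(pattern):
    atoms, i = [], 0
    while i < len(pattern):
        star = i + 1 < len(pattern) and pattern[i + 1] == "*"
        atoms.append((pattern[i], star))
        i += 2 if star else 1
    return atoms

def closure(states, atoms):
    # starred atoms may also be skipped without consuming input
    out, stack = set(), list(states)
    while stack:
        s = stack.pop()
        if s not in out:
            out.add(s)
            if s < len(atoms) and atoms[s][1]:
                stack.append(s + 1)
    return out

def match(pattern, text):
    atoms = parse(pattern)
    states = closure({0}, atoms)
    for ch in text:
        nxt = set()
        for s in states:
            if s < len(atoms) and atoms[s][0] in (ch, "."):
                nxt.add(s if atoms[s][1] else s + 1)
        states = closure(nxt, atoms)
    return len(atoms) in states

print(match("a*b.c", "aaabxc"))  # True
print(match("a*b.c", "bxc"))     # True, 'a*' matches the empty string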
Post by d***@lang.hm
Post by Alan
Post by Kyle Moffett
You just can't do such matching even in polynomial time, let alone
something that scales appropriately for an OS kernel like, say,
O(log(n)).
Already been done. Take a look at "AppArmor" aka "Immunix".
don't forget the ACPI interpreter.
YAProof that bogons follow Bose statistics...
Oleg Verych
2007-09-01 23:20:49 UTC
* Date: Sun, 19 Aug 2007 03:57:58 +0100
Post by Al Viro
Post by d***@lang.hm
don't forget the ACPI interpreter.
YAProof that bogons follow Bose statistics...
or bugons, then.

Why didn't the big minds do rdev-like binary patching of the kernel
image with binary ACPI data? Getting such data in (any) userspace would
be the only real work in the install process of a modern distro;
otherwise it's just boring decompressing and copying.

Lennart Sorensen
2007-08-15 19:20:17 UTC
Post by Marc Perkel
Yep - way outside the box - and thus the title of the
thread.
The idea is that people have permissions - not files.
By people I mean users, groups, managers, applications
etc. One might even specify that there are no
permission restrictions at all. Part of the process
would be that the kernel loads whatever code it will use
for the permission system. It might even be a little
perl script you write.
Also - you aren't even giving permission to access
files. It's permission to access name patterns. One
could apply REGEX masks to names to determine
permissions. So if you have permission to the name you
have permission to the file.
So if I have permission to access /foo/*x but no permission to access
/foo/*y, do I have permission to rename /foo/123x to /foo/123y, and if I
do so, do I lose access to my file? Can I move it back?
Post by Marc Perkel
Hard links would be multiple names pointing to the
same file. Symlinks would be name aliases.
I think I prefer to keep my files inside the box. That way I won't need
to get a bucket. :)

--
Len Sorensen
H. Peter Anvin
2007-08-16 23:12:02 UTC
Post by Marc Perkel
Yep - way outside the box - and thus the title of the
thread.
The idea is that people have permissions - not files.
By people I mean users, groups, managers, applications
etc. One might even specify that there are no
permission restrictions at all. Part of the process
would be that the kernel loads whatever code it will use
for the permission system. It might even be a little
perl script you write.
This isn't anything new. It is, in fact, described in many places.

Permissions can, most generally, be described as a matrix of objects and
security domains. This matrix is large and, generally, highly regular.
If we slice the matrix up and associate each column with an object, we
call it an "access control list". If we slice the matrix up and
associate each row with a security domain, we call it a "capability."

These can be, and often are, daisy-chained, so that an access control
list can contain "all possessors of capability X", for example.

Groups in Unix are, in fact, a form of capabilities.
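
To picture the slicing (toy Python, with made-up names):

MATRIX = {
    ("alice", "/etc/motd"): "r--",
    ("bob",   "/etc/motd"): "rw-",
    ("alice", "/home/alice"): "rwx",
}

def acl(obj):
    # one *column* of the matrix: the object's access control list
    return {dom: p for (dom, o), p in MATRIX.items() if o == obj}

def capability_list(dom):
    # one *row* of the matrix: what the security domain may touch
    return {o: p for (d, o), p in MATRIX.items() if d == dom}

print(acl("/etc/motd"))          # {'alice': 'r--', 'bob': 'rw-'}
print(capability_list("alice"))  # {'/etc/motd': 'r--', '/home/alice': 'rwx'}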

-hpa
Kyle Moffett
2007-08-15 16:58:39 UTC
Post by Marc Perkel
Kyle, thinking further outside the box, files would no longer have
owners or permissions. Nor would directories. People, groups,
managers, and other objects will have permissions. One might tag a
file with the object that created it so you could implement "self"
rights which might be used to replace the concept of /tmp directories.
Well, that's actually kind of close to how SELinux works.

This is the real fundamental design gotcha:
Our current apps *AND* admins speak "UNIX" and "POSIX". They
don't speak "MarcPerkelOS" (or even "SELinux"). As long as there is
not a reasonably-close-to-1-to-1 mapping between UNIX semantics and
your "outside the box" semantics, the latter can't really be used.
It would just involve rewriting too much code *AND* retraining too
many admins from scratch to make it work. Hell, even Windows and Mac
have moved towards a UNIX-like permissions system, precisely because
it's a simple model which is relatively easy to teach people how to
use. ACLs are just a slight modification of that model to allow two
things:
(A) Additional user/group permissions
(B) Default permissions for new child files/dirs/etc

People are having a huge problem with SELinux permissions as is, and
portions of that are a fairly standard model that's been worked over
in various OSes for many years. I seriously doubt that anything that
far "outside the box" is going to be feasible, at least in the near
term.

Good new filesystem developments are likely to be ones which preserve
the same outer model, yet allow for deeper/more-powerful control for
those users/admins who need it.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 17:19:16 UTC
Post by Kyle Moffett
Post by Marc Perkel
Kyle, thinking further outside the box, files would no longer have
owners or permissions. Nor would directories. People, groups,
managers, and other objects will have permissions. One might tag a
file with the object that created it so you could implement "self"
rights which might be used to replace the concept of /tmp directories.
Well, that's actually kind of close to how SELinux works.
Our current apps *AND* admins speak "UNIX" and "POSIX". They
don't speak "MarcPerkelOS" (or even "SELinux"). As long as there is
not a reasonably-close-to-1-to-1 mapping between UNIX semantics and
your "outside the box" semantics, the latter can't really be used.
It would just involve rewriting too much code *AND* retraining too
many admins from scratch to make it work. Hell, even Windows and Mac
have moved towards a UNIX-like permissions system, precisely because
it's a simple model which is relatively easy to teach people how to
use. ACLs are just a slight modification of that model to allow two
things:
(A) Additional user/group permissions
(B) Default permissions for new child files/dirs/etc
People are having a huge problem with SELinux permissions as is, and
portions of that are a fairly standard model that's been worked over
in various OSes for many years. I seriously doubt that anything that
far "outside the box" is going to be feasible, at least in the near
term.
Good new filesystem developments are likely to be ones which preserve
the same outer model, yet allow for deeper/more-powerful control for
those users/admins who need it.
Cheers,
Kyle Moffett
Kyle, what I'm suggesting is scrapping all existing
concepts and replacing them with something entirely
new. Posix, Unix, and SELinux go away, except for an
emulation layer for backwards compatibility. What I'm
suggesting is to start over and do it right.

If this new idea is implemented then one could
implement POSIX as one of many permission modules that
one could load. One could also load a WINDOWS
permission model that could be used with SAMBA. This
would be a new more powerful underlying layer that can
be used to emulate anything you want. And it would be
great for people using FUSE who could make file
systems look any way they want.

One of the problems with the Unix/Linux world is that
your minds are locked into this one model. Doing it
right requires the mental discipline to break out of
that.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Kyle Moffett
2007-08-15 17:37:29 UTC
Post by Marc Perkel
One of the problems with the Unix/Linux world is that your minds
are locked into this one model. Doing it right requires the mental
discipline to break out of that.
The major thing that you are missing is that this "one model" has
been very heavily tested over the years. People understand it, know
how to use it and write software for it, and grok its limitations.
There's also a vast amount of *existing* code that you can't just
"deprecate" overnight; the world just doesn't work that way. The
real way to get there (IE: a new model) from here (IE: the old model)
is the way all Linux development is done: with a lot of sensible,
easy-to-understand changes and refactorings.

With that said, if you actually want to sit down and start writing
*code* for your model, go ahead. If it turns out to be better than
our existing model then I owe you a bottle of your favorite beverage.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 17:59:12 UTC
Post by Kyle Moffett
Post by Marc Perkel
One of the problems with the Unix/Linux world is that your minds
are locked into this one model. Doing it right requires the mental
discipline to break out of that.
The major thing that you are missing is that this "one model" has
been very heavily tested over the years. People understand it, know
how to use it and write software for it, and grok its limitations.
There's also a vast amount of *existing* code that you can't just
"deprecate" overnight; the world just doesn't work that way. The
real way to get there (IE: a new model) from here (IE: the old model)
is the way all Linux development is done: with a lot of sensible,
easy-to-understand changes and refactorings.
With that said, if you actually want to sit down and start writing
*code* for your model, go ahead. If it turns out to be better than
our existing model then I owe you a bottle of your favorite beverage.
Cheers,
Kyle Moffett
When one thinks outside the box one has to think about
evolving beyond what you are used to. When I moved
beyond DOS I had to give up the idea of 8.3 file
names. The idea here is to come up with a model that
can emulate the existing system for backwards
compatibility.

The concept behind my model is to create a new layer
where you can do ANYTHING with file names and
permissions and create models that emulate Linux, DOS,
Windows, Mac, or anything else you can dream of. Then
you can create a Linux/Windows/Mac template to emulate
what you are used to.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Lennart Sorensen
2007-08-15 19:26:07 UTC
Post by Marc Perkel
When one thinks outside the box one has to think about
evolving beyond what you are used to. When I moved
beyond DOS I had to give up the idea of 8.3 file
names. The idea here is to come up with a model that
can emulate the existing system for backwards
compatibility.
But moving beyond 8.3 didn't prevent you from still using 8.3 names if
you wanted to. Longer file names are just an extension of shorter
ones.
Post by Marc Perkel
The concept behind my model is to create a new layer
where you can do ANYTHING with file names and
permissions and create models that emulate Linux, DOS,
Windows, Mac, or anything else you can dream of. Then
you can create a Linux/Windows/Mac template to emulate
what you are used to.
I am not even sure your idea could represent everything that is
currently possible with Posix ACLs and the standard unix permissions,
never mind SELinux. I am also almost entirely convinced that your idea
would be amazingly slow and inefficient and a serious pain to use. And
of course nothing ever seems to succeed without backwards compatibility
to support legacy programs until new ones take over, unless it is in
fact an entirely new concept for a new field and isn't trying to replace
some existing working system.

--
Len Sorensen
Kyle Moffett
2007-08-15 20:11:44 UTC
Post by Lennart Sorensen
Post by Marc Perkel
When one thinks outside the box one has to think about evolving
beyond what you are used to. When I moved beyond DOS I had to give
up the idea of 8.3 file names. The idea here is to come up with a
model that can emulate the existing system for backwards
compatibility.
But moving beyond 8.3 didn't prevent you from still using 8.3 names
if you wanted to. Longer file names are just an extension of
shorter ones.
As another example, take a look at "git", the SCM we use for the
kernel, as contrasted with the older CVS. You can import your
complete CVS history into it without data loss, and then you can even
continue to use it the exact same way you used to use CVS, with some
slight differences in command-line syntax. Once you are ready to
move further, though, you can create multiple local branches to have
your co-workers pull from to test changes. You discover that merging
branches is much easier in git than in CVS. Your company starts to
use a more distributed development model; they implement a policy
telling developers to break up their changes into smaller pieces and
write better change-logs. Somebody suddenly discovers the ability to
"sign" a particular release version with a private key, and you start
doing that as part of your release management to ensure that the
codebase marked with a client tag is the exact same one you actually
shipped to that client.

On a fundamental level, GIT is a completely different paradigm from
CVS. Its internal operations are entirely differently organized, it
uses different algorithms and different storage formats. The end
result of that is that it's literally orders of magnitude faster on
large codebases. But to the USER it can be used exactly the same;
you could even write a little CVS-to-GIT wrapper which imported your
CVS into a git repo and then let you operate on it using "gcvs"
commands the same way you would have operated on real CVS repositories.

Just some food for thought

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 20:44:50 UTC
Post by Kyle Moffett
Post by Lennart Sorensen
Post by Marc Perkel
When one thinks outside the box one has to think about evolving
beyond what you are used to. When I moved beyond DOS I had to give
up the idea of 8.3 file names. The idea here is to come up with a
model that can emulate the existing system for backwards
compatibility.
But moving beyond 8.3 didn't prevent you from still using 8.3 names
if you wanted to. Longer file names are just an extension of
shorter ones.
As another example, take a look at "git", the SCM we use for the
kernel, as contrasted with the older CVS. You can import your
complete CVS history into it without data loss, and then you can even
continue to use it the exact same way you used to use CVS, with some
slight differences in command-line syntax. Once you are ready to
move further, though, you can create multiple local branches to have
your co-workers pull from to test changes. You discover that merging
branches is much easier in git than in CVS. Your company starts to
use a more distributed development model; they implement a policy
telling developers to break up their changes into smaller pieces and
write better change-logs. Somebody suddenly discovers the ability to
"sign" a particular release version with a private key, and you start
doing that as part of your release management to ensure that the
codebase marked with a client tag is the exact same one you actually
shipped to that client.
On a fundamental level, GIT is a completely different paradigm from
CVS. Its internal operations are entirely differently organized, it
uses different algorithms and different storage formats. The end
result of that is that it's literally orders of magnitude faster on
large codebases. But to the USER it can be used exactly the same;
you could even write a little CVS-to-GIT wrapper which imported your
CVS into a git repo and then let you operate on it using "gcvs"
commands the same way you would have operated on real CVS
repositories.
Just some food for thought
Cheers,
Kyle Moffett
Yes - that's a good example. Git is far more powerful
than CVS and a different paradigm. Someone had to think
outside the box and come up with a new way of looking
at things. I'm trying to do something like that with
this idea.

To me it makes more sense to get rid of file
permissions and look at people permissions. It reminds
me of a story a friend of mine told about her 4 year
old son.

The story was that they were driving down the road
when they saw a wheel come off a truck. The son said,
"look mommy, that wheel lost its truck."

To me files are like the wheel. Rather than having the
file know all its owners, it makes more sense for the
owners to know their files.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Lennart Sorensen
2007-08-15 21:04:00 UTC
Permalink
Post by Marc Perkel
To me files are like the wheel. Rather than having the
file know all its owners, it makes more sense for the
owners to know their files.
Except for the fact that on most systems each owner may have millions of
files, while very rarely does a file have more than a few dozen owners
or groups. Having to wade through the permissions of millions of things
seems like a lot more work than checking a few dozen things.

Also, the thing is that I care who can access files, while I do not
really care what particular set of files a user has access to. It is
the data I am protecting, not the user.

--
Len Sorensen
Helge Hafting
2007-08-16 11:42:16 UTC
Permalink
Post by Marc Perkel
Kyle, What I'm suggesting is scrapping all existing
concepts and replacing them with something entirely
new. Posix, Unix, SELinux go away except for an
emulation layer for backwards compatibility. What I'm
suggesting is to start over and do it right.
If you want to get any support for "starting over",
then you need to:
1. Point out some serious problem with the existing stuff,
otherwise why _bother_ starting over?
2. Come up with a truly better idea (demonstrably better)
that isn't so full of obvious flaws that a seasoned kernel
developer can shoot it down in 5 minutes.

Trying to be a visionary with a "great idea" that you are
prepared to let others implement just doesn't work on
this list. If you want to go that route, you
start a company, hire programmers, and tell them to
implement your vision. If your idea is good then
your company succeeds.

If you want to be an open-source visionary, you have to
do the initial work yourself until you attract other interested people.
Post by Marc Perkel
One of the problems with the Unix/Linux world is that
your minds are locked into this one model. Doing it
right requires the mental discipline to break
out of that.
Or perhaps unix has the best model already? ;-)
If you want a big break with the existing unix models, then
perhaps an entirely new project is in order, rather than
trying to change linux. Linux is, after all, in use by millions
who are satisfied with the linux filesystem model already.

Now, linux is open-source, so you can of course use it as a
starting point for your different system. Then you can compete
with "standard linux" - see who attracts the most developers
and the most users in the long run.

Helge Hafting
linux-os (Dick Johnson)
2007-08-16 12:09:25 UTC
Permalink
Post by Helge Hafting
[Helge's reply snipped; quoted in full above]
-
I think Marc Perkel's basic idea was implemented under CP/M as
a "user level." Also, the concept of a virtual directory structure
has been around about as long as "container files," such as those
on the old Intel MDS200 development station. This is hardly
"thinking out-of-the-box." It's just a rehash of some abandoned
stuff.

That said, a major speed limitation of directory lookups
is the manipulation and comparison of variable-length strings.
Maybe Marc might be able to improve this by truly thinking
out-of-the-box.
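
For what it's worth, the dcache already blunts much of that cost:
each name is hashed once, and a full string comparison only happens
when hash and length already match. A simplified stand-in (not the
kernel's actual structures or hash function):

#include <string.h>
#include <stddef.h>

struct name_entry {
    struct name_entry *next;    /* hash-chain link */
    unsigned int hash, len;
    const char *name;
};

static unsigned int name_hash(const char *s, unsigned int len)
{
    unsigned int h = 0;
    while (len--)
        h = h * 31 + (unsigned char)*s++;
    return h;
}

struct name_entry *lookup(struct name_entry *chain, const char *name)
{
    unsigned int len = strlen(name);
    unsigned int hash = name_hash(name, len);

    for (struct name_entry *d = chain; d; d = d->next)
        if (d->hash == hash && d->len == len &&
            memcmp(d->name, name, len) == 0)
            return d;
    return NULL;
}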

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.29 BogoMips).
My book : http://www.AbominableFirebug.com/
Phillip Susi
2007-08-15 17:34:44 UTC
Permalink
Post by Kyle Moffett
Going even further in this direction, the following POSIX ACL on the
## Note: file owner and group are kmoffett
u::rw-
g::rw-
u:lsorens:rw-
u:mtharp:rw-
u:mperkel:rw-
g:randomcvsdudes:r--
default:u::rw-
default:g::rw-
default:u:lsorens:rw-
default:u:mtharp:rw-
default:u:mperkel:rw-
default:g:randomcvsdudes:r--
The problem that I have with this setup is that it specifies an ACL on
EACH file. Yes, you can set a default on the directory for newly
created files, but what if I want to add a user to the access list for
that whole directory? I have to individually update every ACL on every
file in that directory. Also, if you move a file created elsewhere into
that directory, it retains its existing permissions, doesn't it? I would
rather just add a new ACE to the directory itself which specifies that
it applies to the entire tree. Then you only need to store a single ACL
on disk, and only have to update one ACL to add a new user.
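
To sketch what that single tree-wide ACE might look like (the names
and layout here are invented, not an existing kernel interface): the
effective rights on an object become the union of its own ACEs and
the subtree-flagged ACEs of its ancestors, so granting a new user
touches exactly one ACE.

#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

#define ACE_SUBTREE 0x1   /* ACE also applies to everything below */

struct ace { uint32_t uid, rights, flags; };

struct node {
    struct node *parent;      /* NULL at the root */
    const struct ace *aces;
    size_t nace;
};

/* Union the rights granted to uid, walking from the object up to
   the root; only SUBTREE ACEs are inherited from ancestors. */
uint32_t effective_rights(const struct node *n, uint32_t uid)
{
    uint32_t rights = 0;
    for (bool self = true; n; n = n->parent, self = false)
        for (size_t i = 0; i < n->nace; i++)
            if (n->aces[i].uid == uid &&
                (self || (n->aces[i].flags & ACE_SUBTREE)))
                rights |= n->aces[i].rights;
    return rights;
}

The write is cheap; the catch, raised later in the thread, is that
every permission check now walks all the way to the root.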
Marc Perkel
2007-08-15 17:54:13 UTC
Permalink
Post by Phillip Susi
[per-file ACL objection snipped; quoted in full above]
In the model I'm suggesting, files and directories no
longer have permissions, so ACLs go away. Only users,
groups, managers, applications, and other objects have
permissions.

So if you move a file into the tree then everyone
who has permission to that tree has rights to the
file.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Kyle Moffett
2007-08-15 17:53:53 UTC
Permalink
Post by Phillip Susi
The problem that I have with this setup is that it specifies an ACL
on EACH file. Yes, you can set a default on the directory for
newly created files, but what if I want to add a user to the access
list for that whole directory? I have to individually update every
acl on every file in that directory.
We've *always* had to do this; that's what "chmod -R" or "setfacl -R"
are for :-D. The major problem is that the locking and lookup
overhead gets really significant if you have to look at the entire
directory tree in order to determine the permissions for one single
object. I definitely agree that we need better GUIs for managing
file permissions, but I don't see how you could modify the kernel in
this case to do what you want.
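
For concreteness, a recursive chmod in user space is nothing more
than a full tree walk that rewrites every inode it visits, which is
why it costs O(number of files) wherever you run it. A minimal
sketch using nftw(3) (the mode and fd budget are arbitrary examples):

#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <sys/stat.h>
#include <stdio.h>

static mode_t new_mode = 0750;   /* example mode */

static int do_chmod(const char *path, const struct stat *sb,
                    int type, struct FTW *ftwbuf)
{
    (void)sb; (void)type; (void)ftwbuf;
    if (chmod(path, new_mode) != 0)
        perror(path);            /* report and keep walking */
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <dir>\n", argv[0]);
        return 1;
    }
    return nftw(argv[1], do_chmod, 32, FTW_PHYS) ? 1 : 0;
}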
Post by Phillip Susi
Also if you move a file created elsewhere into that directory, it
retains its existing permissions doesn't it?
So what would you have happen when you move another directory into
that directory? Should it retain its permissions? If they change
based on the new directory then do you recurse into each
subdirectory? Such recursing to modify permissions also has
significant performance implications. What about if the file is
hardlinked elsewhere; do those permissions change?


There's also the question of what to do about namespaces and bind
mounts. If I run "mount --bind /foo /home/foo", then do I get
different file permissions depending on what path I access the file
by? What if I then run chroot("/foo"), do I get different file
permissions then? What if I have two namespaces, each with their own
root filesystem (say "root1" and "root2"), and I mount the other
namespace's root filesystem in a subdir of each:
NS1: mount /dev/root2 /otherns
NS2: mount /dev/root1 /otherns

Now I have the following paths to the same file, do these get
different permissions or the same?
NS1:/foo == NS2:/otherns/foo
NS2:/bar == NS1:/otherns/bar

If you answered that they get different permissions, then how do you
handle the massive locking performance penalty due to the extra
lookups? If you answered "same permissions", then how do you explain
the obvious discrepancy to the admin?
Post by Phillip Susi
I would rather just add a new ace to the directory itself which
specifies that it applies to the entire tree. Then you only need
to store a single acl on disk, and only have to update one acl to
add a new user.
The idea is nice, but as soon as you add multiple namespaces,
chroots, or bind mounts the concept breaks down. For security-
sensitive things like file permissions, you really do want
determinate behavior regardless of the path used to access the data.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 18:05:23 UTC
Permalink
Kyle,

In this new system setfacl, chmod, chown, and chgrp
all go away except inside of an emulation layer. Files
and directories no longer have permissions. People
have permission to naming patterns. So if you put a
file into a tree or move a tree then those who have
permissions to the tree have access to the files.

It eliminates the step of having to apply permissions
after moving files into a tree. You don't have to
change file permissions because files no longer have
permissions.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


____________________________________________________________________________________
Shape Yahoo! in your own image. Join our Network Research Panel today! http://surveylink.yahoo.com/gmrs/yahoo_panel_invite.asp?a=7
Kyle Moffett
2007-08-15 18:14:27 UTC
Permalink
Post by Marc Perkel
[message above quoted in full; snipped]
And I'm trying to tell you that unless you have some magic new
algorithm that turns NP-complete problems into O(log(N)) problems,
your idea won't work. You can't just say "I just do one little thing
(mv) and the entire rest of the computer automagically changes to
match", because that would imply a single unscalable global kernel
lock. "Pattern"-matching is either NP-complete or high-polynomial-
order, depending on how it's implemented, and if you want to do a
recursive-chmod during a directory move then you're going to have
race-conditions out the ass. If you have code or solid math to back
up your postings then please do so, but otherwise you're just wasting
time and bandwidth.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-15 20:20:07 UTC
Permalink
Post by Kyle Moffett
[pattern-matching complexity objection snipped; quoted in full above]
Kyle - you are still missing the point. chmod goes
away. File permissions go away. Directories as you
know them go away.



Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Phillip Susi
2007-08-15 20:43:44 UTC
Permalink
Post by Marc Perkel
Kyle - you are still missing the point. chmod goes
away. File permissions goes away. Directories as you
know them goes away.
You are missing the point, Marc... open()ing a file will have to perform
a number of these pattern matches to decide if it is allowed or not...
this would be a HUGE overhead.
Marc Perkel
2007-08-15 20:50:17 UTC
Permalink
Post by Phillip Susi
Post by Marc Perkel
Kyle - you are still missing the point. chmod goes
away. File permissions go away. Directories as you
know them go away.
You are missing the point, Marc... open()ing a file
will have to perform a number of these pattern matches
to decide if it is allowed or not... this would be a
HUGE overhead.
I don't see it as being any worse than what we have
now. To open a file you have to start at the bottom
and open each directory and evaluate the permissions
on the way to the file. In my system you have to look
up the permissions of the string at each "/" separator.
Seems to me that every system would have these same
steps.




Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


V***@vt.edu
2007-08-15 21:20:37 UTC
Permalink
Post by Marc Perkel
I don't see it as being any worse than what we have
now. To open a file you have to start at the bottom
and open each directory and evaluate the permissions
on the way to the file. In my system you have to look
up the permissions of the string at each "/" separator.
Seems to me that every system would have these same
steps.
No - you need to look at the *whole* string - that's the
whole *point* of your system, remember?

Just a few msgs back, you gave a nice example of
having a file with \ in the name rather than / because
it came from a Windows user. So you *do* need to check *every*
pattern against the filename, because it *could* match.
In a system with several hundred thousand or more patterns,
that could be painful.

Also, you need to figure out how to deal with all the various
silly corner cases that people will end up trying to do.

Consider the rules:

peter '*a*' can create
peter '*b*' cannot create

Peter tries to create 'foo-ab-bar' - is he allowed to or not?

For an exercise, either write a program or do it by hand:

Create a list of patterns that correctly express the ownership
and permissions of *every* file on your current Linux box.

Then repeat on a large box with multiple users and a few Oracle
databases or webservers.

Then write a small tool, that given that list, a username,
a filename, and the operation (read, write, open, unlink, etc),
says "Yes or No".

Then run 'strace /bin/ls' in a large directory, take all the
filenames listed in the strace output, and see if your tool can
answer "yes or no" fast enough to make 'ls' feasible.

Come back when you get that part done, and we'll discuss how it
would have to work in the kernel.
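
A first cut at the yes/no tool might look like the following; the
policy choices (first match wins, default deny) are assumptions,
exactly the kind of thing Marc would have to pin down. Note the cost
it makes visible: every decision is O(number of rules) fnmatch(3)
calls against the full name.

#include <fnmatch.h>
#include <string.h>
#include <stdbool.h>

struct rule {
    const char *user;
    const char *pattern;   /* glob matched against the whole path */
    const char *op;        /* "read", "write", "create", ... */
    bool allow;
};

bool permitted(const struct rule *rules, int nrules,
               const char *user, const char *path, const char *op)
{
    for (int i = 0; i < nrules; i++) {
        if (strcmp(rules[i].user, user) != 0) continue;
        if (strcmp(rules[i].op, op) != 0) continue;
        if (fnmatch(rules[i].pattern, path, 0) == 0)
            return rules[i].allow;   /* first match wins */
    }
    return false;                    /* default deny */
}

Under first-match-wins, Peter's 'foo-ab-bar' above is allowed if the
'*a*' rule is listed first and denied if '*b*' is; the answer depends
entirely on rule order, which is exactly the ambiguity being poked at.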
Marc Perkel
2007-08-15 22:48:15 UTC
Permalink
Post by V***@vt.edu
Consider the rules:

peter '*a*' can create
peter '*b*' cannot create

Peter tries to create 'foo-ab-bar' - is he allowed to or not?
[rest of the exercise snipped; quoted in full above]
First - I'm proposing a concept, not writing the
implementation of the concept. You are asking what
happens when someone writes conflicting rules. That
depends on how you implement it. Conflicting rules can
cause unpredictable results.

It may be as simple as first rule wins. Or it may
require all the rules to be true. In the above example
I would say it is not allowed because it matches a
denial condition.

The point is that one can choose any rule system they
want, and the rules apply to the names of the files
and the permissions of the users.
Post by V***@vt.edu
[exercise snipped; quoted in full above]
All you would have to do is create a set of rules that
emulates the current rules and you would have the same
results.
Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



V***@vt.edu
2007-08-16 03:42:45 UTC
Permalink
Post by Marc Perkel
Post by V***@vt.edu
peter '*a*' can create
peter '*b*' cannot create
Peter tries to create 'foo-ab-bar' - is he allowed
to or not?
First - I'm proposing a concept, not writing the
implementation of the concept. You are asking what
happens when someone writes conflicting rules. That
depends on how you implement it. Conflicting rules can
cause unpredictable results.
Good. Go work out what the rules have to be in order for the system to
behave sanely. "Hand-waving concept" doesn't get anywhere. Fully fleshed-out
concepts sometimes do - once they sprout code to actually implement them.
Post by Marc Perkel
The point is that one can choose any rule system they
want, and the rules apply to the names of the files
and the permissions of the users.
No, you *can't* choose any rule system you want - some rule systems are
unworkable because they create security exposures (usually of the
"ln /etc/passwd /tmp/foo" variety, but sometimes race conditions as well).
Post by Marc Perkel
Post by V***@vt.edu
For an exercise, either write a program or do it by hand:
All you would have to do is create a set of rules that
emulates the current rules and you would have the same
results.
Good. Go create it. Let us know when you're done. Remember - not only do
you need to have it generate the same results, you need to find a way to
implement it so that it's somewhere near the same speed as the current code.
If it's 10 times slower, it's not going to fly no matter *how* "cool" it is.
Phillip Susi
2007-08-15 20:38:36 UTC
Permalink
Post by Kyle Moffett
We've *always* had to do this; that's what "chmod -R" or "setfacl -R"
are for :-D. The major problem is that the locking and lookup overhead
gets really significant if you have to look at the entire directory tree
in order to determine the permissions for one single object. I
definitely agree that we need better GUIs for managing file permissions,
but I don't see how you could modify the kernel in this case to do what
you want.
I am well aware of that, I'm simply saying that sucks. Doing a
recursive chmod or setfacl on a large directory tree is slow as all hell.
Post by Kyle Moffett
So what would you have happen when you move another directory into that
directory? Should it retain its permissions? If they change based on
the new directory then do you recurse into each subdirectory? Such
recursing to modify permissions also has significant performance
implications. What about if the file is hardlinked elsewhere; do those
permissions change?
Simple... the file retains any acl it already had, AND the acl of the
new directory now applies. Most likely the moved file had no acl and
was just inheriting its effective acl from its old parent directory.
The end result is that people who used to have access to the file by
virtue of it being in their directory no longer do, and the people who
are supposed to have access to all files in the new directory get access
to this one.

As for hard links, your access would depend on which name you use to
access the file. The file itself may still have an acl that grants or
denies access to people no matter what name they use, but if it allows
inheritance, then which name you access it by will modify the effective
acl that it gets.

As for performance implications, I hardly think they are worrisome.
Each directory in the path has to be looked up anyhow, so you already
have their ACLs; when you finally reach the file, you just have to
take the union of the ACLs encountered on the way. Should only be a
few CPU cycles.
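
A sketch of that union, assuming each component's ACL is already
cached in the dcache (the names and bit layout here are invented,
with the inherited rights mask mentioned below folded in):

#include <stdint.h>

struct dir {
    struct dir *child;       /* next component on this path, or NULL */
    uint32_t acl_bits;       /* rights this directory grants the caller */
    uint32_t inherit_mask;   /* IRM: which inherited bits pass through */
};

uint32_t effective_on_path(const struct dir *d)
{
    uint32_t eff = 0;
    for (; d; d = d->child)
        eff = (eff & d->inherit_mask) | d->acl_bits;
    return eff;   /* a few ANDs and ORs per component */
}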
Post by Kyle Moffett
There's also the question of what to do about namespaces and bind
mounts. If I run "mount --bind /foo /home/foo", then do I get different
file permissions depending on what path I access the file by? What if I
then run chroot("/foo"), do I get different file permissions then? What
if I have two namespaces, each with their own root filesystem (say
"root1" and "root2"), and I mount the other namespace's root filesystem
NS1: mount /dev/root2 /otherns
NS2: mount /dev/root1 /otherns
Now I have the following paths to the same file, do these get different
permissions or the same?
NS1:/foo == NS2:/otherns/foo
NS2:/bar == NS1:/otherns/bar
If you answered that they get different permissions, then how do you
handle the massive locking performance penalty due to the extra
lookups? If you answered "same permissions", then how do you explain
the obvious discrepancy to the admin?
Good question. I would say the bind mount should have a flag to
specify. That way the admin can choose where it should inherit from. I
also don't see where this locking penalty is.
Post by Kyle Moffett
The idea is nice, but as soon as you add multiple namespaces, chroots,
or bind mounts the concept breaks down. For security-sensitive things
like file permissions, you really do want determinate behavior
regardless of the path used to access the data.
How does it break down? Chroots have absolutely no impact at all, and
the bind mounts/namespaces can be handled as I mentioned above. If you
really want to be sure of the effective permissions on the file, then
you simply flag it to not inherit from its parent or use an inherited
rights mask to block the specific inherited permissions you want.
Kyle Moffett
2007-08-15 21:17:17 UTC
Permalink
Al Viro added to the CC, since he's one of the experts on this stuff
and will probably whack me with a LART for explaining it all wrong,
or something. :-D
Post by Phillip Susi
Post by Kyle Moffett
We've *always* had to do this; that's what "chmod -R" or "setfacl -R"
are for :-D. The major problem is that the locking and lookup
overhead gets really significant if you have to look at the entire
directory tree in order to determine the permissions for one
single object. I definitely agree that we need better GUIs for
managing file permissions, but I don't see how you could modify
the kernel in this case to do what you want.
I am well aware of that, I'm simply saying that sucks. Doing a
recursive chmod or setfacl on a large directory tree is slow as all hell.
Doing it in the kernel won't make it any faster.
Post by Phillip Susi
As for hard links, your access would depend on which name you use
to access the file. The file itself may still have an acl that
grants or denies access to people no matter what name they use, but
if it allows inheritance, then which name you access it by will
modify the effective acl that it gets.
You can't safely preserve POSIX semantics that way. For example,
even without *ANY* ability to read /etc/shadow, I can easily
"ln /etc/shadow /tmp/shadow", assuming they are on the same
filesystem. If the /etc/shadow permissions depend on inherited ACLs
to enforce access then that one little command just made your shadow
file world-readable/writeable. Oops.
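
The attack really is that small; on a 2007-era kernel link(2) needs
no permission on the target file at all, only write access to the new
directory and the same filesystem (later kernels grew a
protected_hardlinks sysctl that blocks this). Both names refer to the
same inode afterwards:

#include <stdio.h>
#include <unistd.h>
#include <sys/stat.h>

int main(void)
{
    struct stat a, b;

    if (link("/etc/shadow", "/tmp/shadow") != 0) {
        perror("link");   /* e.g. EXDEV if /tmp is a separate fs */
        return 1;
    }
    stat("/etc/shadow", &a);
    stat("/tmp/shadow", &b);
    printf("same inode: %s\n",
           (a.st_ino == b.st_ino && a.st_dev == b.st_dev) ? "yes" : "no");
    return 0;
}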

Think about it this way:
Permissions depend on *what* something is, not *where* it is. Under
Linux you can leave the digital equivalent of a $10,000 piece of
jewelry lying around in /var/www and not have to worry about it being
compromised as long as you set your permissions properly (not that I
recommend it). Moving the piece of jewelry around your house does
not change what it *is* (and by extension does not change the
protection required on it), any more than "ln /etc/shadow /tmp/shadow"
(or "mv") changes what *it* is. If your /house is really
extraordinarily secure then you could leave the jewelry lying around
as /house/gems.bin with permissions 0777, but if somebody had a
back-door to /house (an open fd, a careless typo, etc) then you'd
have the same issues.
Post by Phillip Susi
As for performance implications, I hardly think it is worrisome.
Each directory in the path has to be looked up anyhow so you
already have their acls, so when you finally reach the file, you
just have to take the union of the acls encountered on the way.
Should only be a few cpu cycles.
Not necessarily. When I do "vim some-file-in-current-directory", for
example, the kernel does *NOT* look up the path of my current
directory. It does (in pseudocode):

if (starts_with_slash(filename)) {
    entry = task->root;
} else {
    entry = task->cwd;
}
while (have_components_left(filename))
    entry = lookup_next_component(filename);
return entry;


That's not even paying attention to functions like "fchdir" or their
interactions with "chroot" and namespaces. I can probably have an
open directory handle to a volume in a completely different
namespace, a volume which isn't even *MOUNTED* in my current fs
namespace. Using that file-handle I believe I can "fchdir",
"openat", etc, in a completely different namespace. I can do the
same thing with a chroot, except there I can even "escape":
/* Switch into chroot. Doesn't drop root privs */
chdir("/some/dir/somewhere");
chroot(".");

/* Malicious code later on */
chdir("/");
chroot("another_dir");
chdir("../../../../../../../../..");
chroot(".");
/* Now I'm back in the real root filesystem */
Post by Phillip Susi
Post by Kyle Moffett
There's also the question of what to do about namespaces and bind
mounts. If I run "mount --bind /foo /home/foo", then do I get
different file permissions depending on what path I access the
file by? What if I then run chroot("/foo"), do I get different
file permissions then? What if I have two namespaces, each with
their own root filesystem (say "root1" and "root2"), and I mount
NS1: mount /dev/root2 /otherns
NS2: mount /dev/root1 /otherns
Now I have the following paths to the same file, do these get
different permissions or the same?
NS1:/foo == NS2:/otherns/foo
NS2:/bar == NS1:/otherns/bar
If you answered that they get different permissions, then how do
you handle the massive locking performance penalty due to the
extra lookups? If you answered "same permissions", then how do
you explain the obvious discrepancy to the admin?
Good question. I would say the bind mount should have a flag to
specify. That way the admin can choose where it should inherit
from. I also don't see where this locking penalty is.
The locking penalty is because the path-lookup is *not* implied. The
above chroot example shows that in detail. If you have to do the
lookup in *reverse* on every open operation then you have to either:

(A) Store lots of security context with every open directory (cwd
included). When a directory you have open is moved, you still have
full access to everything inside it since your handle's data hasn't
changed.

OR

(B) Do a reverse-lookup on every operation, which means you are
taking locks in the reverse order from the normal lookup order and
you're either very inefficient and slow or you're deadlocky. This is
also vulnerable to confusion when the "path" isn't visible from your
namespace/chroot at all. This also causes problems with overmounted
directories:
root@ares:/# mkdir -p /foo/bar
root@ares:/# cd /foo/bar
root@ares:/foo/bar# mount -t tmpfs tmpfs /foo
root@ares:/foo/bar# touch /foo/bar/baz
touch: cannot touch `/foo/bar/baz': No such file or directory
root@ares:/foo/bar# pwd
/foo/bar
root@ares:/foo/bar# ls -al
total 4
drwxr-xr-x 2 root root 4096 2007-08-15 17:04 .
drwxrwxrwt 2 root root 40 2007-08-15 17:04 ..
Post by Phillip Susi
Post by Kyle Moffett
The idea is nice, but as soon as you add multiple namespaces,
chroots, or bind mounts the concept breaks down. For security-
sensitive things like file permissions, you really do want
determinate behavior regardless of the path used to access the data.
How does it break down? Chroots have absolutely no impact at all,
and the bind mounts/namespaces can be handled as I mentioned
above. If you really want to be sure of the effective permissions
on the file, then you simply flag it to not inherit from its parent
or use an inherited rights mask to block the specific inherited
permissions you want.
See above. Linux has a very *very* powerful VFS nowadays. It also
has support for shared subtrees across private namespaces, allowing
things like polyinstantiation. For example you can configure a Linux
box so every user gets their own empty /tmp directory created and
mounted on their namespace's /tmp when they log in, but all the rest
of the system mounts are all shared.

I'll be offline for a couple days, we can chat more when I get back.

Cheers,
Kyle Moffett
Phillip Susi
2007-08-15 22:14:44 UTC
Permalink
Post by Kyle Moffett
Post by Phillip Susi
I am well aware of that, I'm simply saying that sucks. Doing a
recursive chmod or setfacl on a large directory tree is slow as all hell.
Doing it in the kernel won't make it any faster.
Right... I'm talking about getting rid of it entirely.
Post by Kyle Moffett
You can't safely preserve POSIX semantics that way. For example, even
without *ANY* ability to read /etc/shadow, I can easily "ln /etc/shadow
/tmp/shadow", assuming they are on the same filesystem. If the
/etc/shadow permissions depend on inherited ACLs to enforce access then
that one little command just made your shadow file
world-readable/writeable. Oops.
That's why /etc/shadow would have an IRM that would block inheriting any
permissions.
Post by Kyle Moffett
Permissions depend on *what* something is, not *where* it is. Under
No, they don't. At least not always. A lot of times people want
permissions based on where it is.
Post by Kyle Moffett
Linux you can leave the digital equivalent of a $10,000 piece of jewelry
lying around in /var/www and not have to worry about it being
compromised as long as you set your permissions properly (not that I
recommend it). Moving the piece of jewelry around your house does not
change what it *is* (and by extension does not change the protection
required on it), any more than "ln /etc/shadow /tmp/shadow" (or "mv")
changes what *it* is. If your /house is really extraordinarily secure
then you could leave the jewelry lying around as /house/gems.bin with
permissions 0777, but if somebody had a back-door to /house (an open fd,
a careless typo, etc) then you'd have the same issues.
Yes, but if you move it to your front porch, you have changed the
protection on it. If you want to be sure of it being protected, keep it
locked up in a safe... in other words, use the IRM.
Post by Kyle Moffett
Not necessarily. When I do "vim some-file-in-current-directory", for
example, the kernel does *NOT* look up the path of my current
if (starts_with_slash(filename)) {
    entry = task->root;
} else {
    entry = task->cwd;
}
while (have_components_left(filename))
    entry = lookup_next_component(filename);
return entry;
Right.... and task->cwd would have the effective acl in memory, ready to
be combined with any acl set on the file.
Post by Kyle Moffett
That's not even paying attention to functions like "fchdir" or their
interactions with "chroot" and namespaces. I can probably have an open
directory handle to a volume in a completely different namespace, a
volume which isn't even *MOUNTED* in my current fs namespace. Using
that file-handle I believe I can "fchdir", "openat", etc, in a
completely different namespace. I can do the same thing with a chroot,
/* Switch into chroot. Doesn't drop root privs */
chdir("/some/dir/somewhere");
chroot(".");
/* Malicious code later on */
chdir("/");
chroot("another_dir");
chdir("../../../../../../../../..");
chroot(".");
/* Now I'm back in the real root filesystem */
I don't see what this has to do with this discussion, and I also can't
believe that is correct... the chdir( "../../../../.." ) should fail
because there is no such directory.
Post by Kyle Moffett
The locking penalty is because the path-lookup is *not* implied. The
above chroot example shows that in detail. If you have to do the lookup
(A) Store lots of security context with every open directory (cwd
included). When a directory you have open is moved, you still have full
access to everything inside it since your handle's data hasn't changed.
Yes, the effective ACL of the open directory is kept in memory, but in
the directory itself, not the handle to it, thus when the directory is
moved, its ACL is recomputed for the new location and updated
immediately. It is like using fcntl to set a file to non-blocking... it
is the file you set, not the handle to it, so it affects other processes
that have inherited or duplicated the file.
Post by Kyle Moffett
See above. Linux has a very *very* powerful VFS nowadays. It also has
support for shared subtrees across private namespaces, allowing things
like polyinstantiation. For example you can configure a Linux box so
every user gets their own empty /tmp directory created and mounted on
their namespace's /tmp when they log in, but all the rest of the system
mounts are all shared.
I still can't see how these features cause a problem.
Kyle Moffett
2007-08-16 04:44:25 UTC
Permalink
Mmm, slow-as-dirt hotel wireless. What fun...
Post by Phillip Susi
Post by Kyle Moffett
Post by Phillip Susi
I am well aware of that, I'm simply saying that sucks. Doing a
recursive chmod or setfacl on a large directory tree is slow as all hell.
Doing it in the kernel won't make it any faster.
Right... I'm talking about getting rid of it entirely.
Let me repeat myself here: Algorithmically you fundamentally CANNOT
implement inheritance-based ACLs without one of the following
(although if you have some other algorithm in mind, I'm listening):
(A) Some kind of recursive operation *every* time you change an
inheritable permission
(B) A unified "starting point" from which you begin *every*
access-control lookup (or one "starting point" per useful semantic grouping,
like a namespace).

The "(A)" is presently done in userspace and that's what you want to
avoid. As to (B), I will attempt to prove below that you cannot
implement "(B)" without breaking existing assumptions and restricting
a very nice VFS model.
Post by Phillip Susi
Post by Kyle Moffett
Not necessarily. When I do "vim some-file-in-current-directory",
for example, the kernel does *NOT* look up the path of my current
if (starts_with_slash(filename)) {
    entry = task->root;
} else {
    entry = task->cwd;
}
while (have_components_left(filename))
    entry = lookup_next_component(filename);
return entry;
Right.... and task->cwd would have the effective acl in memory,
ready to be combined with any acl set on the file.
What ACL would "task->cwd" use?

Options:
(1.a) Use the one calculated during the original chdir() call.
(1.b) Navigate "up" task->cwd building an ACL backwards.
(1.c) $CAN_YOU_THINK_OF_SOMETHING_ELSE_HERE

Unsolvable problems with each option:

(1.a.I)
You just broke all sorts of chrooted daemons. When I start bind in
its chroot jail, it does the following:
chdir("/private/bind9");
chroot(".");
setgid(...);
setuid(...);
The "/private" directory is readable only by root, since root is the
only one who will be navigating you into these chroots for any
reason. You only switch UID/GID after the chroot() call, at which
point you are inside of a sub-context and your cwd is fully
accessible. If you stick an inheritable ACL on "/private", then the
"cwd" ACL will not allow access by anybody but root and my bind won't
be able to read any config files.

You also break relative paths and directory-moving. Say a process
does chdir("/foo/bar"). Now the ACL data in "cwd" is appropriate
for /foo/bar. If you later chdir("../quux"), how do you unapply the
changes made when you switched into that directory? For inheritable
ACLs, you can't "unapply" such an ACL state change unless you save
state for all the parent directories, except... What happens when
you are in "/foo/bar" and another process does
"mv /foo/bar /foobar/quux"? Suddenly any "cwd" ACL data you have is
completely invalid
and you have to rebuild your ACLs from scratch. Moreover, if the
directory you are in was moved to a portion of the filesystem not
accessible from your current namespace then how do you deal with it?

For example:
NS1 has the / root dir of /dev/sdb1 mounted on /mnt
NS2 has the /bar subdir of /dev/sdb1 mounted on /mnt

Your process is in NS2 and does chdir("/mnt/quux"). A user in NS1
does: "mv /mnt/bar/quux /mnt/quux". Now your "cwd" is in a directory
on a filesystem you have mounted, but it does not correspond *AT ALL*
to any path available from your namespace.

Another example:
Your process has done dirfd=open("/media/cdrom/somestuff") when the
admin does "umount -l /media/cdrom". You still have the CD-ROM open
and accessible but IT HAS NO PATH. It isn't even mounted in *any*
namespace, it's just kind of dangling waiting for its last users to
go away. You can still do fchdir(dirfd),
openat(dirfd, "foo/bar", ...), open("./foo"), etc.

In Linux the ONLY distinction between "relative" and "absolute" paths
is that the "absolute" path begins with a magic slash which implies
that you start at the hidden "root" fd the kernel manages.
Post by Phillip Susi
Post by Kyle Moffett
That's not even paying attention to functions like "fchdir" or
their interactions with "chroot" and namespaces. I can probably
have an open directory handle to a volume in a completely
different namespace, a volume which isn't even *MOUNTED* in my
current fs namespace. Using that file-handle I believe I can
"fchdir", "openat", etc, in a completely different namespace. I
can do the same thing with a chroot, except there I can even
/* Switch into chroot. Doesn't drop root privs */
chdir("/some/dir/somewhere");
chroot(".");
/* Malicious code later on */
chdir("/");
chroot("another_dir");
chdir("../../../../../../../../..");
chroot(".");
/* Now I'm back in the real root filesystem */
I don't see what this has to do with this discussion, and I also
can't believe that is correct... the chdir( "../../../../.." )
should fail because there is no such directory.
No, this is correct because in the root directory "/", the ".." entry
is just another link to the root directory. So the absolute path
"/../../../../../.." is just a fancy name for the root directory.
The above jail-escape-as-root exploit is possible because it is
impossible to determine whether a directory is or is not a subentry
of another directory without an exhaustive search. So when your
"cwd" points to a path outside of the chroot, the one special case in
the code for the "root" directory does not ever match and you can
"chdir" all the way up to the real root. You can even do an fstat()
after every iteration to figure out whether you're there or not!

And yes, this has been exploited before, although not often as
chroot()-ed uid=0 daemons aren't all that common.

So, pray tell, when this code runs and you do the "chroot" call, what
ACL do you think should get stuck on "cwd"? It doesn't reference
anything available relative to the chroot.
Post by Phillip Susi
Post by Kyle Moffett
The locking penalty is because the path-lookup is *not* implied.
The above chroot example shows that in detail. If you have to do
the lookup in *reverse* on every open operation then you have to
(A) Store lots of security context with every open directory
(cwd included). When a directory you have open is moved, you
still have full access to everything inside it since your handle's
data hasn't changed.
Yes, the effective acl of the open directory is kept in memory, but
in the directory itself, not the handle to it, thus when the
directory is moved, it's acl is recomputed for the new location and
updated immediately. It is like using fcntl to set a file to non
blocking... it is the file you set, not the handle to it, so it
effects other processes that have inherited or duplicated the file.
With this you just got into the big-ugly-nasty-recursive-behavior
again. Say I untar 20 kernel source trees and then have my program
open all 1000 available FDs to various directories in the kernel
source tree. Now I run 20 copies of this program, one for each tree,
still well within my ulimits even on a conservative box. Now run "mv
dir_full_of_kernel_sources some/new/dir". The only thing you can do
to find all of the FDs is to iterate down the entire subdirectory
tree looking for open files and updating their contexts one-by-one.
Except you have 20,000 directory FDs to update. Ouch.

To sum up, when doing access control the only values you can safely
and efficiently get at are:
(A) The dentry/inode
(B) The superblock
(C) *Maybe* the vfsmount if those patches get accepted

Any access control model which tries to poke other values is just
going to have a shitload of corner cases where it just falls over.

Cheers,
Kyle Moffett
Phillip Susi
2007-08-16 15:09:16 UTC
Permalink
Post by Kyle Moffett
Let me repeat myself here: Algorithmically you fundamentally CANNOT
implement inheritance-based ACLs without one of the following (although
(A) Some kind of recursive operation *every* time you change an
inheritable permission
(B) A unified "starting point" from which you begin *every*
access-control lookup (or one "starting point" per useful semantic
grouping, like a namespace).
The "(A)" is presently done in userspace and that's what you want to
avoid. As to (B), I will attempt to prove below that you cannot
implement "(B)" without breaking existing assumptions and restricting a
very nice VFS model.
No recursion is needed because only one ACL exists, so that is the only
one you need to update. At least on disk. Any cached ACLs in memory of
descendant objects would need to be updated, but the number of those
should be relatively small. The starting point would be the directory
you start the lookup from. That may be the root, or it may be some other
directory that you have a handle to and thus already has its effective
ACL computed.
Post by Kyle Moffett
What ACL would "task->cwd" use?
(1.a) Use the one calculated during the original chdir() call.
(1.b) Navigate "up" task->cwd building an ACL backwards.
(1.c) $CAN_YOU_THINK_OF_SOMETHING_ELSE_HERE
1.a
Post by Kyle Moffett
(1.a.I)
You just broke all sorts of chrooted daemons. When I start bind in its
chdir("/private/bind9");
chroot(".");
setgid(...);
setuid(...);
The "/private" directory is readable only by root, since root is the
only one who will be navigating you into these chroots for any reason.
You only switch UID/GID after the chroot() call, at which point you are
inside of a sub-context and your cwd is fully accessible. If you stick
an inheritable ACL on "/private", then the "cwd" ACL will not allow
access by anybody but root and my bind won't be able to read any config
files.
If you want the directory to be root accessible but the files inside to
have wider access then you set the acl on the directory to have one ace
granting root access to the directory, and one ace that is inheritable
granting access to bind. This latter ace does not apply to the
directory itself, only to its children.
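
That two-ACE arrangement can be written down with NFSv4-style
inheritance flags (the names echo the ACE4_* flags of RFC 3530, but
the values and structs here are illustrative only):

#include <stdint.h>

#define FILE_INHERIT 0x1   /* files created below inherit this ACE */
#define DIR_INHERIT  0x2   /* subdirectories inherit this ACE      */
#define INHERIT_ONLY 0x4   /* inert on the directory itself        */

struct ace { const char *who; uint32_t rights, flags; };

/* Kyle's /private/bind9 case: only root can open the directory,
   but everything created inside becomes readable by bind. */
static const struct ace private_dir[] = {
    { "root", 07 /* rwx */, 0 },
    { "bind", 05 /* r-x */, FILE_INHERIT | DIR_INHERIT | INHERIT_ONLY },
};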
Post by Kyle Moffett
You also break relative paths and directory-moving. Say a process does
chdir("/foo/bar"). Now the ACL data in "cwd" is appropriate for
/foo/bar. If you later chdir("../quux"), how do you unapply the changes
made when you switched into that directory? For inheritable ACLs, you
can't "unapply" such an ACL state change unless you save state for all
the parent directories, except... What happens when you are in
"/foo/bar" and another process does "mv /foo/bar /foobar/quux"?
Suddenly any "cwd" ACL data you have is completely invalid and you have
to rebuild your ACLs from scratch. Moreover, if the directory you are
in was moved to a portion of the filesystem not accessible from your
current namespace then how do you deal with it?
Yes, if /foo/quux is not already cached in memory, you would have to
walk the tree to build its acl. /foo should already be cached in memory
so this work is minimal. Is this so horrible of a problem?

As for moving, it is handled the same way as any other event that makes
cwd go away, such as deleting it or revoking your access; cwd is now
invalid.
Post by Kyle Moffett
NS1 has the / root dir of /dev/sdb1 mounted on /mnt
NS2 has the /bar subdir of /dev/sdb1 mounted on /mnt
"mv /mnt/bar/quux /mnt/quux". Now your "cwd" is in a directory on a
filesystem you have mounted, but it does not correspond *AT ALL* to any
path available from your namespace.
Which would be no different than if they just deleted the entire thing.
Your cwd no longer exists.
Post by Kyle Moffett
Your process has done dirfd=open("/media/cdrom/somestuff") when the
admin does "umount -l /media/cdrom". You still have the CD-ROM open and
accessible but IT HAS NO PATH. It isn't even mounted in *any*
namespace, it's just kind of dangling waiting for its last users to go
away. You can still do fchdir(dirfd), openat(dirfd, "foo/bar", ...),
open("./foo"), etc.
What's this got to do with ACLs? If you are asking what effect the
umount has on the ACLs of the cdrom, the answer is none. The ACLs are
on the disc and nothing on the disc has changed.
Post by Kyle Moffett
No, this is correct because in the root directory "/", the ".." entry is
just another link to the root directory. So the absolute path
"/../../../../../.." is just a fancy name for the root directory. The
above jail-escape-as-root exploit is possible because it is impossible
to determine whether a directory is or is not a subentry of another
directory without an exhaustive search. So when your "cwd" points to a
path outside of the chroot, the one special case in the code for the
"root" directory does not ever match and you can "chdir" all the way up
to the real root. You can even do an fstat() after every iteration to
figure out whether you're there or not!
Ohh, I see... yes... that is a very clever way for root to misuse
chroot(). What does it have to do with this discussion?
Post by Kyle Moffett
And yes, this has been exploited before, although not often as
chroot()-ed uid=0 daemons aren't all that common.
So, pray tell, when this code runs and you do the "chroot" call, what
ACL do you think should get stuck on "cwd"? It doesn't reference
anything available relative to the chroot.
Same root abuse, same result. The acl on the cwd would still be exactly
what it was before the chroot.
Post by Kyle Moffett
With this you just got into the big-ugly-nasty-recursive-behavior
again. Say I untar 20 kernel source trees and then have my program open
all 1000 available FDs to various directories in the kernel source
tree. Now I run 20 copies of this program, one for each tree, still
well within my ulimits even on a conservative box. Now run "mv
dir_full_of_kernel_sources some/new/dir". The only thing you can do to
find all of the FDs is to iterate down the entire subdirectory tree
looking for open files and updating their contexts one-by-one. Except
you have 20,000 directory FDs to update. Ouch.
Ok, so you found a pedantic corner case that is slow. So? And it is
still going to be faster than chmod -R.
Post by Kyle Moffett
To sum up, when doing access control the only values you can safely and
(A) The dentry/inode
(B) The superblock
(C) *Maybe* the vfsmount if those patches get accepted
Any access control model which tries to poke other values is just going
to have a shitload of corner cases where it just falls over.
If by falls over you mean takes some time, then yes.... so what?
V***@vt.edu
2007-08-16 15:29:22 UTC
Permalink
Post by Phillip Susi
No recursion is needed because only one ACL exists, so that is the only
one you need to update. At least on disk. Any cached ACLs in memory of
descendant objects would need to be updated, but the number of those
should be relatively small.
On my laptop (this is a *laptop*, mind you):

% df -i /home
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/VolGroup00-home
655360 532361 122999 82% /home

What happens if I do a 'mv /home /home1'? Looks like more than a "relatively
small" number. A cold-cache 'find' takes a few minutes to wade through it all,
so any solutions you come up with should beware of locking issues...
Phillip Susi
2007-08-16 17:28:26 UTC
Permalink
Post by V***@vt.edu
What happens if I do a 'mv /home /home1'? Looks like more than a "relatively
small" number. A cold-cache 'find' takes a few minutes to wade through it all,
so any solutions you come up with should beware of locking issues...
Then the directory is moved.... one dentry on disk needs to be changed
to reflect the new name.
V***@vt.edu
2007-08-16 17:31:15 UTC
Permalink
Post by Phillip Susi
Post by V***@vt.edu
What happens if I do a 'mv /home /home1'? Looks like more than a "relatively
small" number. A cold-cache 'find' takes a few minutes to wade through it all,
so any solutions you come up with should beware of locking issues...
Then the directory is moved.... one dentry on disk needs to be changed
to reflect the new name.
That's how it works *now*.

That's *not* what happens with Marc's "Out of Box filesystem", because you need
to deal with the fact that the pathnames to everything under it just changed.
Phillip Susi
2007-08-16 22:03:09 UTC
Permalink
Post by V***@vt.edu
That's how it works *now*.
That's *not* what happens with Marc's "Out of Box filesystem", because you need
to deal with the fact that the pathnames to everything under it just changed.
I think you jumped across subthreads. I have really gone off on a
different topic now and have been talking about acls being dynamically
inherited by children, not regular expression permission matching.
Kyle Moffett
2007-08-16 23:17:47 UTC
Permalink
Post by Phillip Susi
Post by Kyle Moffett
Let me repeat myself here: Algorithmically you fundamentally
CANNOT implement inheritance-based ACLs without one of the
following (although if you have some other algorithm in mind, I'm
all ears):
(A) Some kind of recursive operation *every* time you change an
inheritable permission
(B) A unified "starting point" from which you begin *every*
access-control lookup (or one "starting point" per useful semantic
grouping, like a namespace).
The "(A)" is presently done in userspace and that's what you want
to avoid. As to (B), I will attempt to prove below that you
cannot implement "(B)" without breaking existing assumptions and
restricting a very nice VFS model.
No recursion is needed because only one acl exists, so that is the
only one you need to update. At least on disk. Any cached acls in
memory of descendant objects would need to be updated, but the number
of those should be relatively small. The starting point would be the
directory you start the lookup from. That may be the root, or it
may be some other directory that you have a handle to, and thus,
already has its effective acl computed.
Problem 1: "updating cached acls of descendent objects": How do you
find out what a 'descendent object' is? Answer: You can't without
recursing through the entire in-memory dentry tree. Such recursion
is lock-intensive and has poor performance. Furthermore, you have to
do the entire recursion as an atomic operation; other cross-directory
renames or ACL changes would invalidate your results halfway through
and cause race conditions.

Oh, and by the way, the kernel has no real way to go from a dentry to
a (process, fd) pair. That data simply is not maintained because it
is unnecessary and inefficient to do so. Without that data you
*can't* determine what is "dependent". Furthermore, even if you
could it still wouldn't work because you can't even tell which path
the file was originally opened via. Say you run:
mount --bind /mnt/cdrom /cdrom
umount /mnt/cdrom

Now any process which had a cwd or open directory handle in "/cdrom"
is STILL USING THE ACLs from when it was mounted as "/mnt/cdrom". If
you have the same volume bind-mounted in two places you can't easily
distinguish between them. Caching permission data at the vfsmount
won't even help you because you can move around vfsmounts as long as
they are in subdirectories:
mkdir -p /a/b/foo
mount -t tmpfs tmpfs /a/b/foo
mv /a/b /quux
umount /quux/foo

At this point you would also have to look at vfsmounts during your
recursive traversal and update their cached ACLs too.

Problem 2: "Some other directory that you have a handle to": When
you are given this relative path and this cwd ACL, how do you
determine the total ACL of the parent directory:
path: ../foo/bar
cached cwd total-ACL:
root rwx (inheritable)
bob rwx (inheritable)
somegroup rwx (inheritable)
jane rwx
".." partial-ACL
root +rwx (inheritable)
somegroup +rx (inheritable)

Answer: you can't. For example, if "/" had the permission 'root
+rwx (inheritable)', and nothing else had subtractive permissions,
then the "root +rwx (inheritable)" in the parent dir would be a no-
op, but you can't tell that without storing a complete parent
directory history.

Now assume that I "mkdir /foo && set-some-inheritable-acl-on /foo &&
mv /home /foo/home". Say I'm running all sorts of X apps and GIT and
a number of other programs and have some conservative 5k FDs open on
/home. This is actually something I've done before (without the
ACLs), albeit accidentally. With your proposal, the kernel would
first have to identify all of the thousands of FDs with cached ACL
data across a very large cache-hot /home directory. For each FD, it
would have to store an updated copy of the partial-ACL states down
its entire path. Oh, and you can't do any other ACL or rename
operations in the entire subtree while this is going on, because that
would lead to the first update reporting incorrect results and racing
with the second. You are also extremely slow, deadlock-prone, and
memory hungry, since you have to take an enormous pile of dentry
locks while doing the recursion. Nobody can even open files with
relative paths while this is going on because the cached ACLs are in
an intermediate and inconsistent state: they're updated but the
directory isn't in its new position yet.
Post by Phillip Susi
Post by Kyle Moffett
(1.a.I)
You just broke all sorts of chrooted daemons. When I start bind, it does:
chdir("/private/bind9");
chroot(".");
setgid(...);
setuid(...);
The "/private" directory is readable only by root, since root is
the only one who will be navigating you into these chroots for any
reason. You only switch UID/GID after the chroot() call, at which
point you are inside of a sub-context and your cwd is fully
accessible. If you stick an inheritable ACL on "/private", then
the "cwd" ACL will not allow access by anybody but root and my
bind won't be able to read any config files.
If you want the directory to be root accessible but the files
inside to have wider access then you set the acl on the directory
to have one ace granting root access to the directory, and one ace
that is inheritable granting access to bind. This latter ace does
not apply to the directory itself, only to its children.
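(A rough model of the ACE semantics being described here, loosely in
the style of NFSv4/Windows inherit flags; every type and constant below
is hypothetical, a sketch rather than any real kernel API:)

/* Toy ACE model: an INHERIT_ONLY entry is copied into children but
 * ignored when checking access on the directory itself. */
#include <stdbool.h>

#define ACE_INHERITABLE   0x1  /* propagated to newly created children */
#define ACE_INHERIT_ONLY  0x2  /* does not apply to this object itself */

struct ace {
    unsigned int uid;
    unsigned int perms;        /* e.g. R=4, W=2, X=1 */
    unsigned int flags;
};

static bool allowed(const struct ace *acl, int n,
                    unsigned int uid, unsigned int want)
{
    for (int i = 0; i < n; i++) {
        if (acl[i].flags & ACE_INHERIT_ONLY)
            continue;          /* exists only to be inherited */
        if (acl[i].uid == uid && (acl[i].perms & want) == want)
            return true;
    }
    return false;
}

So "/private" could carry one entry granting root access to the
directory plus one INHERITABLE|INHERIT_ONLY entry for bind: root alone
can open the directory, while files created beneath it come out
readable by bind.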
This is completely opposite the way that permissions currently
operate in Linux. When I am chrooted, I don't care about the
permissions of *anything* outside of the chroot, because it simply
doesn't exist. Furthermore you still don't answer the "computing ACL
of parent directory requires lots of space" problem.
Post by Phillip Susi
Post by Kyle Moffett
You also break relative paths and directory-moving. Say a process
does chdir("/foo/bar"). Now the ACL data in "cwd" is appropriate
for /foo/bar. If you later chdir("../quux"), how do you unapply
the changes made when you switched into that directory? For
inheritable ACLs, you can't "unapply" such an ACL state change
unless you save state for all the parent directories, except...
What happens when you are in "/foo/bar" and another process does
"mv /foo/bar /foobar/quux"? Suddenly any "cwd" ACL data you have
is completely invalid and you have to rebuild your ACLs from
scratch. Moreover, if the directory you are in was moved to a
portion of the filesystem not accessible from your current
namespace then how do you deal with it?
Yes, if /foo/quux is not already cached in memory, you would have
to walk the tree to build its acl. /foo should already be cached
in memory so this work is minimal. Is this so horrible of a problem?
As for moving, it is handled the same way as any other event that
makes cwd go away, such as deleting it or revoking your access; cwd
is now invalid.
No, you aren't getting it: YOUR CWD DOES NOT GO AWAY WHEN YOU MOVE
IT OR UMOUNT -L IT. NEITHER DO OPEN DIRECTORY HANDLES. Sorry for
yelling but this is the crux of the point I am trying to make. Any
permissions system which cannot handle a *completely* discontiguous
filesystem space cannot work on Linux; end of story. The primary
reason behind that is all sorts of filesystem operations are
internally discontiguous because it makes them much more efficient.
By attempting to "force" the VFS to pretend like everything is
contiguous you are going to break horribly in a thousand different
corner cases that simply don't exist at the moment.
Post by Phillip Susi
Post by Kyle Moffett
NS1 has the / root dir of /dev/sdb1 mounted on /mnt
NS2 has the /bar subdir of /dev/sdb1 mounted on /mnt
Your process is in NS2 and does chdir("/mnt/quux"). A user in NS1
does: "mv /mnt/bar/quux /mnt/quux". Now your "cwd" is in a
directory on a filesystem you have mounted, but it does not
correspond *AT ALL* to any path available from your namespace.
Which would be no different than if they just deleted the entire
thing. Your cwd no longer exists.
No, your cwd still exists and is full of files. You can still
navigate around in it (same with any open directory handle). You can
still open files, chdir, move files, etc. There isn't even a way for
the process in NS1 to tell the processes in NS2 that its directories
were rearranged, so even a simple "NS1# mv /mnt/bar/a/somedir
/mnt/bar/b/somedir" is not going to work.
Post by Phillip Susi
Post by Kyle Moffett
Your process has done dirfd=open("/media/cdrom/somestuff") when
the admin does "umount -l /media/cdrom". You still have the CD-
ROM open and accessible but IT HAS NO PATH. It isn't even mounted
in *any* namespace, it's just kind of dangling waiting for its
last users to go away. You can still do fchdir(dirfd), openat
(dirfd, "foo/bar", ...), open("./foo"), etc.
What's this got to do with acls? If you are asking what effect the
umount has on the acls of the cdrom, the answer is none. The acls
are on the disc and nothing on the disc has changed.
But you said above "Yes, if /foo/quux is not already cached in
memory, then you would have to walk the tree to build its ACL". Now
assume that instead of "/foo/quux", you are one directory deep in the
now-unmounted CDROM and you try to open "../baz/quux". In order to
get at the ACL of the parent directory it has to have an absolute
path somewhere, but at that point it doesn't.
Post by Phillip Susi
Post by Kyle Moffett
No, this is correct because in the root directory "/", the ".."
entry is just another link to the root directory. So the absolute
path "/../../../../../.." is just a fancy name for the root
directory. The above jail-escape-as-root exploit is possible
because it is impossible to determine whether a directory is or is
not a subentry of another directory without an exhaustive search.
So when your "cwd" points to a path outside of the chroot, the one
special case in the code for the "root" directory does not ever
match and you can "chdir" all the way up to the real root. You
can even do an fstat() after every iteration to figure out whether
you're there or not!
Ohh, I see... yes... that is a very clever way for root to misuse
chroot(). What does it have to do with this discussion?
What it "has to do" is it is part of the Linux ABI and as such you
can't just break it because it's "inconvenient" for inheritable
ACLs. You also can't make a previously O(1) operation take lots of
time, as that's also considered "major breakage".
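(For reference, the escape being described looks roughly like this; a
minimal sketch that assumes the process already has uid 0 inside the
original chroot, with error handling omitted:)

/* Classic chroot() escape by a uid-0 process, as described above.
 * Sketch only. */
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    struct stat a, b;

    mkdir("hole", 0700);
    chroot("hole");          /* root moves, but cwd stays outside it */
    for (;;) {
        stat(".", &a);
        chdir("..");         /* the "am I at /?" check never matches */
        stat(".", &b);
        if (a.st_dev == b.st_dev && a.st_ino == b.st_ino)
            break;           /* ".." == ".": this is the real root */
    }
    chroot(".");             /* re-root the process at the real "/" */
    return 0;
}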
Post by Phillip Susi
Post by Kyle Moffett
With this you just got into the big-ugly-nasty-recursive-behavior
again. Say I untar 20 kernel source trees and then have my
program open all 1000 available FDs to various directories in the
kernel source tree. Now I run 20 copies of this program, one for
each tree, still well within my ulimits even on a conservative
box. Now run "mv dir_full_of_kernel_sources some/new/dir". The
only thing you can do to find all of the FDs is to iterate down
the entire subdirectory tree looking for open files and updating
their contexts one-by-one. Except you have 20,000 directory FDs
to update. Ouch.
Ok, so you found a pedantic corner case that is slow. So? And it
is still going to be faster than chmod -R.
"Pedantic corner case"? You could do the same thing even *WITHOUT*
all the processes holding open FDs, you would still have to iterate
over the entire in-cache portion of the subtree in order to verify
that there are no open FDs on it. Yet again you would also run into
the problem that we don't have *ANY* dentry-to-filehandle mapping in
the kernel.
Post by Phillip Susi
Post by Kyle Moffett
To sum up, when doing access control the only values you can safely and
reliably use are:
(A) The dentry/inode
(B) The superblock
(C) *Maybe* the vfsmount if those patches get accepted
Any access control model which tries to poke other values is just
going to have a shitload of corner cases where it just falls over.
If by falls over you mean takes some time, then yes.... so what?
Converting a previously O(1) operation into an O(number-of-subdirs)
operation is also known as "a major regression which we don't do a
release till we get it fixed". For boxes where O(number-of-subdirs)
numbers in the millions that would make it slow to a painful crawl.

By the way, I'm done with this discussion since you don't seem to be
paying attention at all. Don't bother replying unless you've
actually written testable code you want people on the list to look
at. I'll eat my own words if you actually come up with an algorithm
which works efficiently without introducing regressions.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-17 04:24:22 UTC
Permalink
Several people have asked about how to mass move a
tree under my idea for a new kind of file system. I
have an idea. Suppose you have the file name as
follows.

/one/two/three/four/file1

Except there are a million files in /four/ named file1
to file1000000.

We want to move these files to /seven/six/five. How do
you do that fast? Here's an idea. Suppose you have a
hash not only of files but of the directory sections.
Every new section is added and given a number. For
simplicity the table might look like this:

one 1
two 2
three 3
four 4
five 5
six 6
seven 7

So what the path is reduced to is /1/2/3/4, which points
to those names. So if you want to move it then you
change the names in the database:

seven 1
six 2
five 3

And then everything that's stored as /1/2/3/4 is still
the same but the sections resolve to different names.
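
(Purely to illustrate the mechanism being sketched here - a rename that
only rewrites the number-to-name table - a toy model in C; everything
in it is hypothetical:)

/* Toy model: paths are stored as sequences of component IDs, and a
 * "move" rewrites only the ID->name table, never the million file
 * entries themselves. Hypothetical sketch, not a real design. */
#include <stdio.h>

static const char *names[] = {
    NULL, "one", "two", "three", "four", "five", "six", "seven"
};

/* A stored path: /1/2/3/4/file1 */
static const int path_ids[] = { 1, 2, 3, 4 };

static void print_path(void)
{
    for (int i = 0; i < 4; i++)
        printf("/%s", names[path_ids[i]]);
    printf("/file1\n");
}

int main(void)
{
    print_path();                /* /one/two/three/four/file1 */
    names[1] = "seven";          /* remap the table, as above... */
    names[2] = "six";
    names[3] = "five";
    print_path();                /* ...and every stored ID-path now */
    return 0;                    /* resolves to the new names */
}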

I'm sure there are errors in my logic but I'm trying
to show that if you are persistent in trying to come
up with ideas on how to do something you will
eventually make it work. But if you are looking for
ways to make it not work then you probably won't solve
it.

So the correct response to this message isn't to prove
that my method won't work, but to come up with a
method that will work. You have to look for a solution
rather than attack other people's solutions.

That's what thinking outside the box means.

Impossible = Challenge


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


V***@vt.edu
2007-08-17 04:52:51 UTC
Permalink
Post by Marc Perkel
And then everything that's stored as /1/2/3/4 is still
the same but the sections resolve to different names.
At that point, you need to go re-think your full-pathname permission scheme,
because that severely broke it.
Post by Marc Perkel
I'm sure there are errors in my logic but I'm trying
to show that if you are persistent in trying to come
up with ideas on how to do something you will
eventually make it work.
Not true - there are *large* fields where *no* amount of persistence will
make it work - squaring a circle or trisecting an angle by geometric means,
perpetual motion machines, and many broken-as-designed computer programs.
Post by Marc Perkel
But if you are looking for
ways to make it not work then you probably won't solve
it.
Right. It's not *our* job to solve your problem. It's *our* job to find
ways to not make it work - those are called *bugs*. We try very hard to keep
buggy code and ideas out of the kernel. So we make it hard for bugs to get in.
Post by Marc Perkel
So the correct response to this message isn't to prove
that my method won't work, but to come up with a
You're the one proposing something. *You* get to come up with something
that works. It's *our* job to review and comment and point out problems.
You want to get something to happen, you take the comments, and use them
to improve the result. We point out every possible issue we can find (and
maybe suggest ways to fix them). That's what code review is about.

And if you think people like me and Kyle are rough on you, I suggest you go
back in the archives and read what people like Al Viro and Christoph Hellwig
had to say about ReiserFS4. And *that* was not just a hand-wave like
you're doing, that was an actual *working* filesystem.
Post by Marc Perkel
method that will work. You have to look for a solution
rather than attack other people's solutions.
No, *you* have to look for a solution that is good enough to withstand the
combined scrutiny of everybody on the linux-kernel list. That's why Linux
works, and why we're able to have several *million* lines of changes between
2.6.2N and 2.6.2(N+1), and only have several dozen things on the "known
regressions" list.
Post by Marc Perkel
That's what thinking outside the box means.
Impossible = Challenge
Good. Let us know when you come up with something that actually works.
Phillip Susi
2007-08-17 15:19:21 UTC
Permalink
Post by Kyle Moffett
Problem 1: "updating cached acls of descendent objects": How do you
find out what a 'descendent object' is? Answer: You can't without
recursing through the entire in-memory dentry tree. Such recursion is
lock-intensive and has poor performance. Furthermore, you have to do
the entire recursion as an atomic operation; other cross-directory
renames or ACL changes would invalidate your results halfway through and
cause race conditions.
Yes, it would take some cpu time, and yes, it would have to use a lock
to protect the acl which would also lock out moves. Is that such a high
cost? Changing acls and moving whole directory trees around is not THAT
common of an operation... if it takes a wee bit more cpu time, I doubt
anyone will complain.
Post by Kyle Moffett
Oh, and by the way, the kernel has no real way to go from a dentry to a
(process, fd) pair. That data simply is not maintained because it is
unnecessary and inefficient to do so. Without that data you *can't*
What would you need (process,fd) for? You just need to walk the tree of
dentries and update them.
Post by Kyle Moffett
determine what is "dependent". Furthermore, even if you could it still
wouldn't work because you can't even tell which path the file was
mount --bind /mnt/cdrom /cdrom
umount /mnt/cdrom
Now any process which had a cwd or open directory handle in "/cdrom" is
STILL USING THE ACLs from when it was mounted as "/mnt/cdrom". If you
have the same volume bind-mounted in two places you can't easily
distinguish between them. Caching permission data at the vfsmount won't
even help you because you can move around vfsmounts as long as they are
mkdir -p /a/b/foo
mount -t tmpfs tmpfs /a/b/foo
mv /a/b /quux
umount /quux/foo
At this point you would also have to look at vfsmounts during your
recursive traversal and update their cached ACLs too.
Each bind mount point would need its own dentry so it could have its own
acl and parent pointer.
Post by Kyle Moffett
Problem 2: "Some other directory that you have a handle to": When you
are given this relative path and this cwd ACL, how do you determine the
total ACL of the parent directory:
path: ../foo/bar
cached cwd total-ACL:
root rwx (inheritable)
bob rwx (inheritable)
somegroup rwx (inheritable)
jane rwx
".." partial-ACL
root +rwx (inheritable)
somegroup +rx (inheritable)
Answer: you can't. For example, if "/" had the permission 'root +rwx
(inheritable)', and nothing else had subtractive permissions, then the
"root +rwx (inheritable)" in the parent dir would be a no-op, but you
can't tell that without storing a complete parent directory history.
What? The total acl of the parent directory is in its dentry.
Post by Kyle Moffett
Now assume that I "mkdir /foo && set-some-inheritable-acl-on /foo && mv
/home /foo/home". Say I'm running all sorts of X apps and GIT and a
number of other programs and have some conservative 5k FDs open on
/home. This is actually something I've done before (without the ACLs),
albeit accidentally. With your proposal, the kernel would first have to
identify all of the thousands of FDs with cached ACL data across a very
large cache-hot /home directory. For each FD, it would have to store an
updated copy of the partial-ACL states down its entire path. Oh, and
you can't do any other ACL or rename operations in the entire subtree
while this is going on, because that would lead to the first update
reporting incorrect results and racing with the second. You are also
extremely slow, deadlock-prone, and memory hungry, since you have to
take an enormous pile of dentry locks while doing the recursion. Nobody
can even open files with relative paths while this is going on because
the cached ACLs are in an intermediate and inconsistent state: they're
updated but the directory isn't in its new position yet.
Again, such a move would take some cpu time but the locks would not
block file opens, just other moves and permission changes. I don't
think this burden is terribly high for what is a relatively rare
operation.
Post by Kyle Moffett
This is completely opposite the way that permissions currently operate
in Linux. When I am chrooted, I don't care about the permissions of
*anything* outside of the chroot, because it simply doesn't exist.
Yes, it is different... if it wasn't we wouldn't be talking about it.
Post by Kyle Moffett
Furthermore you still don't answer the "computing ACL of parent
directory requires lots of space" problem.
What problem?
Post by Kyle Moffett
No, you aren't getting it: YOUR CWD DOES NOT GO AWAY WHEN YOU MOVE IT
OR UMOUNT -L IT. NEITHER DO OPEN DIRECTORY HANDLES. Sorry for yelling
It effectively does if you chmod 000 it. Same idea. You yank
permissions to cwd, then you can't access cwd anymore.
Post by Kyle Moffett
but this is the crux of the point I am trying to make. Any permissions
system which cannot handle a *completely* discontiguous filesystem space
cannot work on Linux; end of story. The primary reason behind that is
all sorts of filesystem operations are internally discontiguous because
it makes them much more efficient. By attempting to "force" the VFS to
pretend like everything is contiguous you are going to break horribly in
a thousand different corner cases that simply don't exist at the moment.
Not sure what you mean here.
Post by Kyle Moffett
No, your cwd still exists and is full of files. You can still navigate
around in it (same with any open directory handle). You can still open
files, chdir, move files, etc. There isn't even a way for the process
in NS1 to tell the processes in NS2 that its directories were
rearranged, so even a simple "NS1# mv /mnt/bar/a/somedir
/mnt/bar/b/somedir" is not going to work.
No, like above, your example is equivalent to either an rm -fr or a
chmod 000 of cwd. That means you either no longer can use cwd, or it is
now empty.

Whether it is with chmod or with a move changing the inherited acls, if
you lose access to cwd, then you can no longer access cwd.
Post by Kyle Moffett
But you said above "Yes, if /foo/quux is not already cached in memory,
then you would have to walk the tree to build its ACL". Now assume
that instead of "/foo/quux", you are one directory deep in the
now-unmounted CDROM and you try to open "../baz/quux". In order to get
at the ACL of the parent directory it has to have an absolute path
somewhere, but at that point it doesn't.
It isn't unmounted yet, it is just disconnected from the tree, so it
would still be using the same acl it had prior to the disconnect.
Post by Kyle Moffett
What it "has to do" is it is part of the Linux ABI and as such you can't
just break it because it's "inconvenient" for inheritable ACLs. You
also can't make a previously O(1) operation take lots of time, as that's
also considered "major breakage".
It isn't inconvenient at all for inheritable acls, nor would chroot or
chdir be any slower.
Post by Kyle Moffett
"Pedantic corner case"? You could do the same thing even *WITHOUT* all
the processes holding open FDs, you would still have to iterate over the
entire in-cache portion of the subtree in order to verify that there are
no open FDs on it. Yet again you would also run into the problem that
we don't have *ANY* dentry-to-filehandle mapping in the kernel.
Again, don't care about filehandles. The only thing you have to do is
walk the dentries and update them. No different than a chmod -R or
setfacl -R, except of course, you only have to walk the dentries already
in memory, not update anything on disk.
Post by Kyle Moffett
Converting a previously O(1) operation into an O(number-of-subdirs)
operation is also known as "a major regression which we don't do a
release till we get it fixed". For boxes where O(number-of-subdirs)
numbers in the millions that would make it slow to a painful crawl.
We aren't talking about the scheduler or something here. We are talking
about an optional feature that doesn't slow things down if you don't
bother using it. If you aren't trying to move a super massive directory
tree which inherits permissions ( you can flag objects to NOT inherit
from their parent ) and has millions of cached dentries, then there is
no slowdown.
Post by Kyle Moffett
By the way, I'm done with this discussion since you don't seem to be
paying attention at all. Don't bother replying unless you've actually
written testable code you want people on the list to look at. I'll eat
my own words if you actually come up with an algorithm which works
efficiently without introducing regressions.
Testy testy... and why is it that the standard cop out on this list is
"code up or shut up"? If you don't want to discuss the idea, then don't...
V***@vt.edu
2007-08-17 15:39:43 UTC
Permalink
Post by Phillip Susi
Post by Kyle Moffett
Problem 1: "updating cached acls of descendent objects": How do you
find out what a 'descendent object' is? Answer: You can't without
recursing through the entire in-memory dentry tree.
I suspect Kyle is not quite correct - it's probably the case that you don't
have to consider just the in-memory dentries, but *all* the descendent objects
in the entire file system.

If you have a clever proof that on-disk can't *possibly* be affected, feel
free to present it.

(Does anybody know offhand what means 'chacl -r' uses to avoid race conditions
with directories being moved in/out from under it, or does it just say "we'll
make a best stab at it"?)
Post by Phillip Susi
Yes, it would take some cpu time, and yes, it would have to use a lock
to protect the acl which would also lock out moves. Is that such a high
cost? Changing acls and moving whole directory trees around is not THAT
common of an operation... if it takes a wee bit more cpu time, I doubt
anyone will complain.
It will become even *more* of a "not that common" if the lock will block moves
and ACL changes *across the filesystem* for potentially *minutes* at a time.
Phillip Susi
2007-08-17 19:01:48 UTC
Permalink
Post by V***@vt.edu
I suspect Kyle is not quite correct - it's probably the case that you don't
have to consider just the in-memory dentries, but *all* the descendent objects
in the entire file system.
If you have a clever proof that on-disk can't *possibly* be affected, feel
free to present it.
Why would you have to consider the descendent entries on disk when you
are only changing an entry in the parent? The effects of that change
are only computed in memory when the dentry for a child is created, so
you don't have to do a bunch of disk churning to change permissions on
the whole tree. In fact, all of the children may very well have NO acl
of their own stored on disk, which also saves space.

The whole idea here is that there is ONE acl that applies to the whole
tree, rather than have every object in the tree have its own acl.
That's why every object in the tree on the disk is not affected by a
change.
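
(A toy model of the lookup-time inheritance being described: only nodes
with an explicit ACL store one, and everything else computes an
effective ACL from its parent on first use and caches it. All type and
field names here are hypothetical, a sketch rather than working kernel
code:)

/* Toy model of dynamically inherited ACLs. Nodes without their own
 * ACL inherit the parent's effective ACL; results are cached so the
 * walk toward the root happens only on a cache miss. */
#include <stddef.h>

typedef unsigned int acl_t;        /* stand-in for a real ACL */

struct node {
    struct node *parent;           /* NULL for the root */
    const acl_t *own_acl;          /* NULL means "inherit" */
    acl_t cached;                  /* last computed effective ACL */
    int cache_valid;
};

static acl_t effective_acl(struct node *n)
{
    if (n->cache_valid)
        return n->cached;
    if (n->own_acl)                /* an explicit ACL stops the walk */
        n->cached = *n->own_acl;
    else if (n->parent)
        n->cached = effective_acl(n->parent);
    else
        n->cached = 0;             /* root default: no access */
    n->cache_valid = 1;
    return n->cached;
}

Changing an ACL then means invalidating the cached values of the
in-memory descendants, which is exactly the walk being argued over in
this thread.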
Post by V***@vt.edu
It will become even *more* of a "not that common" if the lock will block moves
and ACL changes *across the filesystem* for potentially *minutes* at a time.
It will not take anywhere NEAR minutes at a time to update the in memory
dentries, more like 50ms.
Kyle Moffett
2007-08-18 05:48:02 UTC
Permalink
Post by Phillip Susi
Post by V***@vt.edu
It will become even *more* of a "not that common" if the lock will
block moves and ACL changes *across the filesystem* for
potentially *minutes* at a time.
It will not take anywhere NEAR minutes at a time to update the in
memory dentries, more like 50ms.
One last comment:

50ms to update in-memory dentries would be FRIGGING TERRIBLE!!!
Using Perl, an interpreted language, the following script takes 3.39s
to run on one of my lower-end systems:

for (0 .. 10000) {
    mkdir "a-$_";            # create one directory,
    mkdir "b-$_";            # create a second one,
    rename "a-$_", "b-$_";   # then rename the first over it
}

It's not even deleting things afterwards so it's populating a
directory with ten thousand entries. We can easily calculate
10,000/3.39 = 2,949 entries per second, or 0.339 milliseconds per entry.

When I change it to rmdir things instead, the runtime goes down to
2.89s == 3460 entries/sec == 0.289 milliseconds per entry.

If such a scheme even increases the overhead of a directory rename by
a hundredth of a millisecond on that box it would easily be a 2-3%
performance hit. Given that people tend to kill for 1% performance
boosts, that's not likely to be a good idea.

Cheers,
Kyle Moffett
Marc Perkel
2007-08-18 16:45:54 UTC
Permalink
Post by Kyle Moffett
Post by Phillip Susi
Post by V***@vt.edu
It will become even *more* of a "not that common" if the lock will
block moves and ACL changes *across the filesystem* for potentially
*minutes* at a time.
It will not take anywhere NEAR minutes at a time to update the in
memory dentries, more like 50ms.
50ms to update in-memory dentries would be FRIGGING TERRIBLE!!!
If such a scheme even increases the overhead of a directory rename by
a hundredth of a millisecond on that box it would easily be a 2-3%
performance hit. Given that people tend to kill for 1% performance
boosts, that's not likely to be a good idea.
Cheers,
Kyle Moffett
What I suggested was a concept of a new way to look at
a file system. What you are arguing here is why it
wouldn't work based on your theories as to how such a
file system would be implemented. In attacking how
slow you think it might be you are making assumptions
that wouldn't apply to how this would be implemented.
You are assuming that it would be implemented in ways
that you are familiar with. That is a wrong
assumption.

Linux isn't going to make progress when people try to
figure out how to make something NOT work rather than
to make something work. So if you are going to put
effort into this then why not try to figure out how to
get around the issues you are raising rather than to
attack the idea as unsolvable.

When I originally suggested that the names would be a
"hash" I didn't mean that it is going to be only a
hash. You have successfully argued that just a hash
would have problems. Which means that a real solution
is going to be more complex.

I suggest that it would be easier to figure out how to
make moves of large directory structures fast and
efficient with automatic inheritance of rights.

I know it can be done because Microsoft is doing it
and Novell Netware was doing it 20 years ago. So the
fact that it is done by others disproves your
arguments that it can't be done.



Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Al Viro
2007-08-18 18:19:59 UTC
Permalink
Post by Marc Perkel
Linux isn't going to make progress when people try to
figure out how to make something NOT work rather than
to make something work. So if you are going to put
effort into this then why not try to figure out how to
get around the issues you are raising rather than to
attack the idea as unsolvable.
It's your idea; _you_ get to defend it against the problems found by
reviewers. And whining about negativity is the wrong way to do that.
Look at it that way: there is science and there is feel-good woo.
The former depends on peer review. The latter depends on not having
it and vague handwaving is the classical way of avoiding it. So are
the claims of being a "visionary" and accusing critics of being uncooperative
reactionaries conspiring against the progress.

So far you are doing very poorly; if you want somebody else to join
you in experimenting with these ideas, you are acting in a very
inefficient way (and if you don't want anybody else, you'll obviously
have to deal with details yourself anyway). Asserting that critics
should patch the holes in your handwaving is unlikely to impress anybody;
arrogance is not in short supply around here and yours is not even
original.
Marc Perkel
2007-08-19 04:07:05 UTC
Permalink
No Al, there isn't any shortage of arrogance here.

Let me try to repeat what I'm talking about as simply
as I can.

First - I'm describing a kind of functionality and
suggesting Linux should have it. I know a lot of it
can be done because much of what I'm suggesting is
already working in Windows and Netware.

I'm not the one who's going to code it. I'm just
saying that it would be nice if Linux had the
functionality of other operating systems - and - take
it to the next level - match it and do even better.

As to thinking outside the box, what I'm proposing is
outside the box relative to Linux. It's not as
original compared to Windows or Netware, which is
even better.

The idea is that Linux is lacking features that other
OSs have. What I'm suggesting is that Linux not only
match them but create a rights layer more powerful
than the rest, and I'm outlining a concept in the
hopes that people will get excited about it and want
to build on the idea.

I'm just telling you what I'd like to see. I'm not
going to code it. So I'm only going to talk about what
is possible. How it's done will be up to any
programmers who might be inspired by the idea. If no
one is inspired then Linux will continue to be in last
place when it comes to file system features relating
to fine grain permissions.

In Linux, for example, users are allowed to delete
files that they are prohibited from reading or
writing. In Netware if a user can't read or write to
the file they won't even be able to see that the file
exists, let alone delete it.

In Netware I can move a directory tree into another
tree and the objects that have rights in the other
tree will have rights to all the new files without
having to run utilities on the command line to
recursively change the permission afterwards.

The point - Linux isn't going to move forward and
catch up unless there is a fundamental change in the
thinking behind Linux permissions. There is a
cultural lack of innovation here. I discussed this
with Andrew Morton and he made some suggestions but
there's real hostility towards new concepts here.
Something I don't understand. At some point Linux
needs to grow beyond just being an evolved Unix clone
and that's not going to happen if you don't think
differently.

I still believe that the VI editor causes brain
damage. :)




Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


Nix
2007-08-20 07:05:09 UTC
Permalink
[Lurker delurking here: I've seen Marc pull this sort of trick on
multiple mailing lists now and I've had enough. Coming up with
wild ideas, thinking they must be ideas nobody else has ever
considered, and never thinking them through for five minutes is
sort of Marc's hallmark, I'm afraid.]
Post by Marc Perkel
No Al, there isn't any shortage of arrogance here.
Actually Al's being unusually nice to you. A regular would have had a
few lines of `you must be kidding' but he's trying to get you to see
*why* you are wrong.

(It's not working.)
Post by Marc Perkel
Let me try to repeat what I'm talking about as simply
as I can.
Precision is better.
Post by Marc Perkel
First - I'm describing a kind of functionality and
suggesting Linux should have it. I know a lot of it
can be done because much of what I'm suggesting is
already working in Windows and Netware.
Windows and Netware don't have multiple namespaces to cope with
(e.g. chroot()); nor do they have the ability to detach parts of their
filesystem tree completely from other parts (so they're not even
connected at the root).

Their VFSes are much less capable than Linux's.

ACLs provide some of the inheritability you want anyway.


In regards to the general idea of doing permission checking in
userspace: that works to some extent --- see FUSE --- but you really
*cannot* allow the entity who specifies permissions check semantics on
the filesystem to have less privileges than the entity who mounted the
filesystem in the first place. It's safe for that entity to specify such
things because it can read the device the filesystem is mounted from
anyway, thus overriding all permissions if it wants to: but any entity
of lesser privilege could use this as a really cheap privilege
escalation attack. (This is fundamental and unavoidable and no amount
of thinking outside any sort of box will fix it.)
Post by Marc Perkel
I'm not the one who's going to code it. I'm just
Might I suggest then that coming here and bloviating is not the right
approach? In the Linux world there are two ways to proceed: doing the
work yourself or paying someone to. You can sometimes get lucky and find
that someone else has already done what you were thinking of, but it is
vanishingly rare to have an idea so cool that someone else catches fire
and does all the work for you. In fact I can't recall ever seeing it
happen, although I suppose it might among longtime collaborators some of
whom are very overloaded with other things.

So you could make this a mount option, I suppose, but nothing more
pluggable than that.

(I don't see how your userspace permissions layer will avoid calling
back into the VFS, either: take care to avoid infloops when you code
it.)
Post by Marc Perkel
saying that it would be nice if Linux had the
functionality of other operating systems - and - take
it to the next level - match it and do even better.
`The next level'? What `next level'?
Post by Marc Perkel
As to thinking outside the box, what I'm proposing is
outside the box relative to Linux. It's not as
original as compared to Windows or Netware which is
even better.
Actually neither of those OSes provide regex-based permissions checking
that I know of. It's not a fundamentally silly idea but if you'd been
paying even the slightest attention to this list or one of its summaries
(like LWN's) over the last, oh, three years you'd have noticed the
difficulty that security modules which depend on name matching are
experiencing in winning people over. And AppArmor has clear semantics,
working code and actual users, none of which your idea has got.

(And in regard to your idea being so startlingly original in the Linux
world, well, you're not quite a decade behind the times.)
Post by Marc Perkel
The idea is that Linux is lacking features that other
OSs have. What I'm suggesting is that Linux not only
match it but to create an even more powerful rights
layer that is more powerful than the rest and I'm
I see no sign of significant extra power in what you've mentioned so
far: only extra ways to create security holes and/or infinite
regressions.
Post by Marc Perkel
outlining a concept in the hopes that people would get
excited about the concept and want to build on the
idea.
It's not working.
Post by Marc Perkel
I'm just telling you what I'd like to see. I'm not
going to code it. So I'm only going to talk about what
is possible.
Anyone on this list is capable enough to have thought of this one
themselves, and generally to have seen the holes as well. You don't even
seem to have *looked* for holes, which is strange given that this is
pretty much the most important job of anyone who's just thought of a new
idea.
Post by Marc Perkel
one is inspired the Linux will continue to be in last
place when it comes to file system features relating
to fine grain permissions.
Yay, very crude moral blackmail (well, it would be moral blackmail if
what you were saying was in any way persuasive). Way to inspire people!
Post by Marc Perkel
In Linux, for example, users are allowed to delete
files that they are prohibited from reading or
writing. In Netware if a user can't read or write to
the file they won't even be able to see that the file
exists, let alone delete it.
This implies that you're allowed to have multiple files of the same name
in the same directory. I don't think so. i.e. you *can* see that the
file exists by trying to create one of the same name and watching it
inexplicably fail. i.e. this measure does not actually provide increased
security against a determined attacker. i.e., this measure is largely
useless.
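
(Concretely, the probe described here could be sketched like this in C;
the file name is made up for illustration:)

/* If open() with O_CREAT|O_EXCL fails with EEXIST, a file by that
 * name exists - even if directory listings hide it from us. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("secret.txt", O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0 && errno == EEXIST) {
        printf("secret.txt exists, even though we cannot list it\n");
    } else if (fd >= 0) {
        printf("secret.txt did not exist; clean up our probe\n");
        close(fd);
        unlink("secret.txt");
    }
    return 0;
}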
Post by Marc Perkel
In Netware I can move a directory tree into another
tree and the objects that have rights in the other
tree will have rights to all the new files without
having to run utilities on the command line to
recursively change the permission afterwards.
That works until you have chroot() or detachable namespaces: and saying
`oh, nobody uses those' is irrelevant: *at any time* any chunk of the
namespace might get detached and go haring off on its own (as it were),
and permissions checks under the detached piece had better not change
semantics should that happen.
Post by Marc Perkel
The point - Linux isn't going to move forward and
catch up unless there is a fundamental change in the
thinking behind Linux permissions. There is a
All fundamental changes in this area require *very* careful thinking
about the security implications. Complexity increases are also something
to avoid if possible because they make the model harder to think about:
and even slight increases can have large effects. (Witness all the pain
that the addition of two-or-is-it-three extra permissions sets brought
with POSIX capabilities: multiple security holes until they got
effectively disabled back in 2.2.x. They're still not back on although
with luck that may be changing.)
Post by Marc Perkel
cultural lack of innovation here. I discussed this
There is entirely justified *paranoia* about changing security
semantics, combined with a lack of desire to do all your working *and*
thinking for you.
Post by Marc Perkel
with Andrew Morton and he made some suggestions but
there's real hostility towards new concepts here.
There's rather less hostility when there's working code (but still quite
a bit, witness AppArmor).
Post by Marc Perkel
Something I don't understand. At some point Linux
needs to grow beyond just being an evolved Unix clone
and that's not going to happen if you don't think
differently.
Sorry, Linux *is* a Unix clone. If you want an experimental OS, well, git
makes it easy to fork. However that will mean doing the work yourself.
Post by Marc Perkel
I still believe that the VI editor causes brain
damage. :)
I hasten to add that using Emacs doesn't *always* cause an inability to
analyze one's ideas.
Brennan Ashton
2007-08-20 07:47:10 UTC
Permalink
While I highly support innovation, until I see a well laid out
structure of what exactly you are looking for I have a hard time
expressing any views that are meaningful. Could you create some kind of
wiki or summary email (if this is really that important to you)? Most
of us are lazy and have better things to do, so make it easy for us.
A) Create a list of the current problems or just inefficiencies in
the current system.
B) Create a list of all the points that make up your view of a good
file system.
C) Cross the two lists, showing how your idea would fix the current problems.

I am not saying that the current way is the right or wrong way, just
that I think you have organised your ideas as if you are thinking out
loud by email (which is ok by me, just stop the direct attacks if you
are).
I agree that every company and program gets caught in a rut that does
not conform to changing markets and technology, especially if it was
at one time a success - IBM, Microsoft, Apple, Sun, the American auto
industry, to name a few. These are also examples of companies that have
gotten the idea that what they do might not be the right way and have
made some attempt to step back (some more successfully than others) or
face a loss in market share.
Just remember a key point when rethinking something as key as a file
system: while your new battery may be much more efficient than my old
AA, I want it to work with my old flashlights too (and no aftermarket
refit kit).
Should the surroundings be modified for the target, or the target
modified for the surroundings?
Your little rants about VI and rm are not helpful; if these programs
were so bad then why have they survived? Linux is one hell of a project
to put together - sorry, but innovation did come from people using VI
and Emacs. Btw, I highly recommend the command man; you should try it.
--
Brennan Ashton
Bellingham, Washington

"The box said, 'Requires Windows 98 or better'. So I installed Linux"
Marc Perkel
2007-08-20 11:18:21 UTC
Permalink
Post by Brennan Ashton
While I highly support innovation, until I see a well laid out
structure of what exactly you are looking for I have a hard time
expressing any views that are meaningful.
What's the point? People are openly hostile to new
ideas here. I started out nice and laid out my ideas
and you have a bunch of morons who attack anything
new.

At least finally someone fixed the RM problem.

Look at the reality of the situation. Linux is free
and yet it can't compete with operating systems that
are paid for. Maybe the reason is that when someone
points out that something is broken all you get is
justification and excuses and insults.

Read the thread from the beginning and you'll see that
I started out that way.

If you attack people who are pointing out flaws and
making suggestions then people will stop pointing out
flaws and making suggestions.

Think about it. Why did it take 20 years for Linux to
fix the RM problem? If you type RM * you expect the
files to be gone, not some stupid error saying I'm
trying to delete too many files.

So whose fault is that? I say it's a problem with
Linux culture. If something is broken you have to
justify it instead of fixing it.

If developers take that kind of attitude then progress
stops.

You guys are trying to make the RM problem MY FAULT
because I didn't say it nicely. Well, it doesn't have
to be said nicely. If something is broken then it
needs to be fixed regardless of who points it out and
how.

A BUG is a BUG is a BUG. You fix bugs, not make
excuses and try to explain them away. If you went up to
any computer user and asked them if when they type "rm
*" they expect the files to be deleted, they will
say "yes". Yet in the Linux world the command doesn't
work. And it's not like it breaks after MILLIONS of
files. It breaks on just a few thousand files, if
that.

So what does it tell you when something like this is
left broken for so long? What it tells me is that the
development process is broken.

My rant on VI is to make a point. That point being
that when you use an editor that totally sucks then
it's going to cause you to write code that sucks. It's
going to lower your standards. It's going to create a
culture where poorly done work is considered
acceptable. When you use an editor as poor as vi then
the idea that rm * doesn't work becomes acceptable and
justifiable, as demonstrated here by people who
ACTUALLY DEFENDED IT.



Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com


linux-os (Dick Johnson)
2007-08-20 13:32:58 UTC
Permalink
On Mon, 20 Aug 2007, Marc Perkel wrote:
[Snipped...]
Post by Marc Perkel
What's the point? People are openly hostile to new
ideas here. I started out nice and laid out my ideas
and you have a bunch of morons who attack anything
new.
[Snopped...]
Post by Marc Perkel
Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com
No. You started out by saying you have this great idea
that you want somebody else to develop. This met with
strong resistance of course, because linux-kernel
developers don't work for you.

If you were to create some new "thinking out-of-the-box"
file system, you could include it into Linux just as all
the other file-systems have been included (it started
with minix). If it was truly useful, it might even get
included into the main-line distributions if you
maintained it.

People are not openly hostile to new ideas. They are openly
hostile to being abused.

Cheers,
Dick Johnson
Penguin : Linux version 2.6.22.1 on an i686 machine (5588.29 BogoMips).
My book : http://www.AbominableFirebug.com/
Lennart Sorensen
2007-08-20 15:25:12 UTC
Permalink
Post by Marc Perkel
Look at the reality of the situation. Linux is free
and yet it can't compete with operating systems that
are paid for. Maybe the reason is that when someone
points out that something is broken all you get is
justification and excuses and insults.
And when someone points out windows is broken, Microsoft doesn't give a
damn and you can't do a thing about it.
Post by Marc Perkel
Read the thread from the beginning and you'll see that
I started out that way.
If you attack people who are pointing out flaws and
making suggestions then people will stop pointing out
flaws and making suggestions.
Just because you think they are flaws doesn't mean they are. Your
complaint of rm * certainly fits that. You just didn't know what you
were doing or you would have realized rm * was the wrong thing to use.
Post by Marc Perkel
Think about it. Why did it take 20 years for Linux to
fix the RM problem? If you type RM * you expect the
files to be gone, not some stupid error saying I'm
trying to delete too many files.
There is no problem with rm *. The wildcard expansion is done by the
shell, and different shells have different maximum command line lengths
(it would be pretty inefficient to allow infinite command lines).
Windows makes the stupid choice that every single program has to invent
and support wildcard expansion on its own. This means most commands on
windows don't support wildcards at all. At least doing it in the shell
makes unix consistent and much more flexible for the user. In a few
cases you may have a wildcard that expands to too much. Well that is
why we have find and xargs and such. It was a known limitation, and an
efficient solution was invented to deal with the unusual case of too
many files matching a wildcard. Not a problem, nothing to fix.
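
(The limit being discussed is on the argument list passed to execve(),
not on deletion itself; a loop over readdir(), which is roughly what
find's -delete does, removes any number of files without ever building
a giant command line. A hedged sketch:)

/* Delete every entry in a directory without passing file names on a
 * command line, so ARG_MAX never comes into play. Sketch only:
 * failures (e.g. subdirectories) are silently skipped. */
#include <dirent.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *path = argc > 1 ? argv[1] : ".";
    DIR *d = opendir(path);
    struct dirent *e;

    if (!d || chdir(path) != 0)
        return 1;
    while ((e = readdir(d)) != NULL) {
        if (strcmp(e->d_name, ".") && strcmp(e->d_name, ".."))
            unlink(e->d_name);
    }
    closedir(d);
    return 0;
}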
Post by Marc Perkel
So whose fault is that? I say it's a problem with
Linux culture. If something is broken you have to
justify it instead of fixing it.
The user was at fault for doing something stupid and not using the right
tools for the job.
Post by Marc Perkel
If developers take that kind of attitude then progress
stops.
You prefer the developer telling you that you have to not use wildcards
and type all filenames manually (as windows would for most commands),
or that instead they make the system allocate all system ram for your
command line just in case you decide you should delete 2 billion files
at once with rm *?
Post by Marc Perkel
You guys are trying to make the RM problem MY FAULT
because I didn't say it nicely. Well, it doesn't have
to be said nicely. If something is broken then it
needs to be fixed regardless of who points it out and
how.
A BUG is a BUG is a BUG. You fix bugs, not make
excuses and try to explain them away. If you went up to
any computer user and asked them if when they type "rm
*" they expect the files to be deleted, they will
say "yes". Yet in the Linux world the command doesn't
work. And it's not like it breaks after MILLIONS of
files. It breaks on just a few thousand files, if
that.
Not a bug. A design decision. A sensible one at that, and one to which
a workaround exists (which happens to be faster than rm * too).

--
Len Sorensen
Helge Hafting
2007-08-20 15:26:33 UTC
Permalink
Post by Marc Perkel
What's the point? People are openly hostile to new
ideas here. I started out nice and laid out my ideas
and you have a bunch of morons who attack anything
new.
People here are not hostile to any new idea. They are
generally hostile to anyone who suggests some
"improvement" that he wants others to actually implement.
Because that is acting like a boss, and you're not the boss here.

Also - what do you expect when you bring an idea and say
"well, don't look at what might go wrong, that's just
limitations of the existing parts of linux?"

Sure, everybody can overlook how your ideas don't fit into
current linux - but then they have nothing more to say.
What people here do is to look at new ideas/patches, and
point out problems with them. When there are no more
problems left, the patch is applied and linux improves.

If we're not to point out problems with your ideas, then
they can't be used in linux. That is not hostility, but how
linux development works.

Mostly people try to save you from doing unnecessary work -
so you don't go and implement a flawed filesystem that
will be rejected immediately when you show up with the patch.

Shooting down bad ideas saves tremendous amounts of work,
killing an idea at the discussion stage means the idea never
got to the much more labor-intensive implementation stage.

This doesn't mean that all new ideas are killed, only the bad ones.


If you want your ideas accepted, you'll have to come up
with something that isn't flawed, that is well planned, not
just a bunch of "well - we could do *that* perhaps" but then
it turns out that *that* idea was flawed as well.

Helge Hafting

Nix
2007-08-20 19:52:45 UTC
Permalink
Post by Helge Hafting
Shooting down bad ideas saves tremendous amounts of work,
killing an idea at the discussion stage means the idea never
got to the much more labor-intensive implementation stage.
This doesn't mean that all new ideas are killed, only the bad ones.
Even the good ones pretty much never start out perfect[1]. It always
takes some criticism and fixing before the code is right, let alone
the design.

(Of course part of the problem with this idea is that it was so vague
that the only problems there *could* be with it were huge gaping ones:
you can't have subtle problems with an idea with no subtle elements.
Unfortunately it had a lot of those, and fixing them without junking
the whole idea would be hard. No, I haven't thought about how to do
it: Marc might like to, though, what with it being his idea and all.)

[1] except mingo's. mingo has the Perfect Design Mojo. How this came
to pass, mere mortals may not speculate.
Randy Dunlap
2007-08-20 16:21:25 UTC
Permalink
Post by Marc Perkel
My rant on VI is to make a point. That point being
that when you use an editor that totally sucks then
it's going to cause you to write code that sucks. It's
going to lower your standards. It's going to create a
culture where poorly done work is considered
acceptable. When you use an editor as poor as vi then
the idea that rm * doesn't work becomes acceptable and
justifiable, as demonstrated here by people who
ACTUALLY DEFENDED IT.
So enlighten us. What editor do you use/suggest/push?

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***
Xavier Bestel
2007-08-20 16:20:20 UTC
Permalink
Post by Randy Dunlap
so enlighten us. What editor do you use/suggest/push?
Corbet !
Phillip Susi
2007-08-20 14:29:22 UTC
Permalink
Post by Marc Perkel
No Al, there isn't any shortage of arrogance here.
Let me try to repeat what I'm talking about as simply
as I can.
First - I'm describing a kind of functionality and
suggesting Linux should have it. I know a lot of it
can be done because much of what I'm suggesting is
already working in Windows and Netware.
Let's not get confused here... _I_ am suggesting a feature that windows
and netware have done already, that being inheritable acls. YOU
suggested that permissions be configurable based on pattern matching on
the name of the file.
Lennart Sorensen
2007-08-20 15:13:06 UTC
Permalink
Post by Marc Perkel
No Al, there isn't any shortage of arrogance here.
Yes you provide plenty yourself.
Post by Marc Perkel
Let me try to repeat what I'm talking about as simply
as I can.
First - I'm describing a kind of functionality and
suggesting Linux should have it. I know a lot of it
can be done because much of what I'm suggesting is
already working in Windows and Netware.
And how many limitations do windows and netware have that linux
doesn't have, exactly due to that special functionality? Part of the
usefulness of permissions on unix is that they are so simple to
understand and manage. They are really hard to screw up. The
permission systems on windows and netware are much more complex and much
harder for people to understand, and hence much easier to get wrong
causing potential security problems on your system without you realizing
it.
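
For illustration, the whole unix model fits in a single chmod (the
file name and the ls output below are made up):

$ chmod 640 report.txt   # owner: read/write, group: read, others: nothing
$ ls -l report.txt
-rw-r----- 1 alice staff 1024 Aug 20  2007 report.txt

One glance at those nine mode bits tells you who can do what.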
Post by Marc Perkel
I'm not the one who's going to code it. I'm just
saying that it would be nice if Linux had the
functionality of other operating systems - and - take
it to the next level - match it and do even better.
Many of the bugs of other operating systems would be nice to avoid. In
some cases what you may see as a feature is a major cause of problems,
especially performance problems.
Post by Marc Perkel
As to thinking outside the box, what I'm proposing is
outside the box relative to Linux. It's not as
original as compared to Windows or Netware which is
even better.
How about dropping that stupid phrase.
Post by Marc Perkel
The idea is that Linux is lacking features that other
OSs have. What I'm suggesting is that Linux not only
match it but create a rights
layer that is more powerful than the rest, and I'm
outlining a concept in the hopes that people would get
excited about the concept and want to build on the
idea.
Well so far your system has been shown to be very inefficient (with no
solution to that proposed by anyone). Many of the ideas have been shown
to have been tried in the past and then abandoned due to not working
very well or having severe scalability problems.

There are lots of people (very often people currently in university) who
suffer from the problem of believing anything they think of that isn't
in current use must be a new unique idea that must be used because no
one else had thought of it. In almost every case it is not a new idea,
and the reason it is not in use is because it didn't work the last time
someone thought of it and tried it. Or in some cases it is in use but
just not in a field the new genius has any knowledge of.
Post by Marc Perkel
I'm just telling you what I'd like to see. I'm not
going to code it. So I'm only going to talk about what
is possible. How it's done will be up to any
programmers who might be inspired by the idea. If no
one is inspired then Linux will continue to be in last
place when it comes to file system features relating
to fine grain permissions.
How do you know what is possible? Could it be that many of the problems
on netware and windows are caused by their choice of filesystem design?
Post by Marc Perkel
In Linux, for example, users are allowed to delete
files that they are prohibited from reading or
writing. In Netware if a user can't read or write to
the file they won't even be able to see that the file
exists, let alone delete it.
In some cases it makes sense. After all, you are not deleting the file,
you are deleting one hardlink to it and the hardlink is part of the
directory. If other directories have hardlinks to the same file, those
get to stay around. Hence the permissions make sense.
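
A quick demonstration (hypothetical paths; any directory you can
write to will do):

$ mkdir d && touch d/secret && chmod 000 d/secret
$ ln d/secret d/alias   # a second hardlink; no read access needed
$ rm -f d/secret        # allowed: this only needs write access to d
$ ls d
alias

The file's data is untouched; only one name for it went away.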
Post by Marc Perkel
In Netware I can move a directory tree into another
tree and the objects that have rights in the other
tree will have rights to all the new files without
having to run utilities on the command line to
recursively change the permission afterwards.
Does netware have hardlinks? Does it have chroots? bind mounts? How
about the ability to delete open files without causing problems for
programs that have the files open?
Post by Marc Perkel
The point - Linux isn't going to move forward and
catch up unless there is a fundamental change in the
thinking behind Linux permissions. There is a
cultural lack of innovation here. I discussed this
with Andrew Morton and he made some suggestions but
there's real hostility towards new concepts here.
Something I don't understand. At some point Linux
needs to grow beyond just being an evolved Unix clone
and that's not going to happen if you don't think
differently.
Linux seems to be gaining ground in usage. Netware on the other hand
seems just about dead. What does that say about things?
Post by Marc Perkel
I still believe that the VI editor causes brain
damage. :)
You should see what word and notepad do to people.

--
Len Sorensen
Phillip Susi
2007-08-20 14:24:51 UTC
Permalink
50ms to update in-memory dentries would be FRIGGING TERRIBLE!!! Using
Perl, an interpreted language, the following script takes 3.39s to run:

for (0 .. 10000) {
    mkdir "a-$_";            # create a directory
    mkdir "b-$_";            # create a second, empty directory
    rename "a-$_", "b-$_";   # rename the first over the second
}
It's not even deleting things afterwards so it's populating a directory
with ten thousand entries. We can easily calculate 10,000/3.39 = 2,949
entries per second, or 0.339 milliseconds per entry.
When I change it to rmdir things instead, the runtime goes down to 2.89s
== 3460 entries/sec == 0.289 milliseconds per entry.
If such a scheme even increases the overhead of a directory rename by a
hundredth of a millisecond on that box it would easily be a 2-3%
performance hit. Given that people tend to kill for 1% performance
boosts, that's not likely to be a good idea.
The question is how many dentries are cached at the time? And it looks
like you are just renaming, not moving, so there would be no need to
recompute the acls at all.
Marc Perkel
2007-08-15 22:40:16 UTC
Permalink
Post by Kyle Moffett
Al Viro added to the CC, since he's one of the
experts on this stuff
and will probably whack me with a LART for
explaining it all wrong,
or something. :-D
Thanks - I appreciate that.

Just to catch everyone up on what this thread is
about, I'm proposing a new way of looking at file
systems where files no longer have permission, owners,
or groups, or file attributes. The idea is that
people, groups, managers, applications, and other
objects have permissions to names that are pointers to
files.
Post by Kyle Moffett
Post by Phillip Susi
Post by Kyle Moffett
We've *always* had to do this; that's what "chmod -R" or "setfacl -R"
are for :-D. The major problem is that the locking and lookup overhead
gets really significant if you have to look at the entire directory
tree in order to determine the permissions for one single object. I
definitely agree that we need better GUIs for managing file
permissions, but I don't see how you could modify the kernel in this
case to do what you want.
I am well aware of that, I'm simply saying that sucks. Doing a
recursive chmod or setfacl on a large directory tree is slow as all
hell.
Doing it in the kernel won't make it any faster.
Post by Phillip Susi
As for hard links, your access would depend on which name you use to
access the file. The file itself may still have an acl that grants or
denies access to people no matter what name they use, but if it
allows inheritance, then which name you access it by will modify the
effective acl that it gets.
You can't safely preserve POSIX semantics that way. For example, even
without *ANY* ability to read /etc/shadow, I can easily
"ln /etc/shadow /tmp/shadow", assuming they are on the same
filesystem. If the /etc/shadow permissions depend on inherited ACLs
to enforce access then that one little command just made your shadow
file world-readable/writeable. Oops.
Permissions depend on *what* something is, not *where* it is. Under
Linux you can leave the digital equivalent of a $10,000 piece of
jewelry lying around in /var/www and not have to worry about it being
compromised as long as you set your permissions properly (not that I
recommend it). Moving the piece of jewelry around your house does not
change what it *is* (and by extension does not change the protection
required on it), any more than "ln /etc/shadow /tmp/shadow" (or "mv")
changes what *it* is. If your /house is really extraordinarily secure
then you could leave the jewelry lying around as /house/gems.bin with
permissions 0777, but if somebody had a back-door to /house (an open
fd, a careless typo, etc) then you'd have the same issues.
My proposal is somewhat the same. If one puts a
restriction on a specific name to deny access to users
then that denial follows that filename even if it is
copied or moved. However, if a file has no specific
restrictions and is in a restricted directory, then the
file inherits the restrictions and permissions of the
new directory based on where it is.

If you don't want your jewelry lying around then
don't put a copy of it in a folder where users have
access to it.




Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Marc Perkel
2007-08-15 17:02:29 UTC
Permalink
Post by Marc Perkel
Post by Marc Perkel
For example. If you list a directory you only see the files that you
have some rights to and files where you have no rights are invisible
to you. If a file is read only to you then you can't delete it
either. Having write access to a directory really means that you have
file create rights. You can also delete files that you have write
access to. You would also allocate permissions to manage file rights
like being able to set the rights of inferior users.
Imagine the fun you will have trying to write a file
name and being told
you cannot write it for some unknown reason.
Unbeknownst to you, there is
a file there, but it is not owned by you, thus
invisible.
Making a file system more user oriented would avoid
little gotchas like
this. The reason it is "programmer oriented" is
that those are the people
who have worked out why it works and why certain
things are bad ideas.
That's not a problem - it's a feature. In such a
situation the person would get a general file creation
error. Although it isn't likely people would structure
files with invisible files in directories where the
user has create permissions, it is logical that if I
put a file in a place where the user has no rights, I
want it to stay there. Currently the user can delete
files where they have no rights.

I might also want to restrict the kind of files a user
can create, or give permission to create only certain
file names.

/etc/vz/conf/*.conf - create - readonly - self-rw
/etc/vz/conf - deny

This would allow the user to read all *.conf files,
create new *.conf files, and have full permission to
read/write/delete files that the user created but not
files that others created. When listing the directory,
only the *.conf files would appear even if other files
are in the directory.
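
For comparison, the creator-rw/others-read half of this can roughly
be approximated today with POSIX default ACLs (a sketch; the *.conf
pattern matching has no existing equivalent, and the mode/umask of
the creating process still applies):

setfacl -d -m u::rw-,g::r--,o::r-- /etc/vz/conf  # new files: owner rw, others read
getfacl /etc/vz/conf                             # inspect the default ACL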


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Michael Tharp
2007-08-15 17:30:29 UTC
Permalink
Post by Marc Perkel
That's not a problem - it's a feature. In such a
situation the person would get a general file creation
error.
Feature or not, it's still vulnerable to probing by malicious users. If
there are create permissions on the directory, the invisibility is not
perfect.
Post by Marc Perkel
Although it isn't likely people would structure
files with invisible files in directories that the
user has create permissions [...]
... /tmp ...
Post by Marc Perkel
[...] it is logical that if I
put a file in a place where the user has no rights I
want it to stay there. Currently the user can delete
files where they have no rights.
Indeed. The sticky bit works around this, but IMHO it's a hack.
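
The hack in action, for reference (standard ls output; the link
count, size and date are illustrative):

$ ls -ld /tmp
drwxrwxrwt 12 root root 4096 Aug 15  2007 /tmp

The trailing 't' means that even in a world-writable directory, only
a file's owner (or root) may unlink or rename its entries.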
Post by Marc Perkel
I might also want to restrict the kind of files a user
can create, or give permission to create only certain
file names.
/etc/vz/conf/*.conf - create - readonly - self-rw
/etc/vz/conf - deny
This would allow the user to read all *.conf files,
create new *.conf files, and have full permission to
read/write/delete files that the user created but not
files that others created. When listing the directory,
only the *.conf files would appear even if other files
are in the directory.
It'd be interesting to find a use case for this, but that's no reason
not to provide the functionality.
-- m. tharp
Marc Perkel
2007-08-15 17:51:02 UTC
Permalink
Post by Marc Perkel
Post by Marc Perkel
That's not a problem - it's a feature. In such a situation the person
would get a general file creation error.
Feature or not, it's still vulnerable to probing by malicious users.
If there are create permissions on the directory, the invisibility is
not perfect.
In a real world situation I would think that users
probing for invisible files is more secure that users
knowing the names of files that they have no access
to.
Post by Marc Perkel
Post by Marc Perkel
Although it isn't likely people would structure
files with invisible files in directories that the
user has create permissions [...]
... /tmp ...
You're still thinking inside the box. Let's take the
tmp directory for example. /tmp would probably go away
in favor of personal /tmp directories. As we all know,
/tmp is the source of a lot of vulnerabilities.

One might put a name translation mask on the /tmp name
in the file name translation system. For example:

/tmp -> /mperkel/tmp

Thus files written to /tmp would become /mperkel/tmp
and users wouldn't be able to see other users' /tmp
files or have any name conflicts.
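
Something close to this can already be faked with bind mounts in a
per-session mount namespace (pam_namespace does roughly this; the
/tmp-inst path below is just a convention, not a fixed name):

mkdir -p /tmp-inst/mperkel            # one private tmp per user
chmod 700 /tmp-inst/mperkel
mount --bind /tmp-inst/mperkel /tmp   # done inside that user's mount namespace

Every process in that namespace then sees its own files under /tmp,
with no name collisions between users.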

Let me explain about the concept of thinking outside
the box. If you run into a problem you figure out a
new solution. It's about finding ways to make things
work rather than finding ways to make things not work.

So - we are not only talking about a name permission
system but a file name translation system. Thus a
user's view of the file system might not be the same
for all users. In fact, let's say that mperkel is a
Windows user and is just attaching to Linux as a file
system. Because mperkel is in the windows group, the
file system appears as h:\home\mperkel rather than the
native Linux /home/mperkel, and mounts are drive
letters. It would use a Windows name translation mask
program that would be part of the permission/naming
system.




Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com



Tim Tassonis
2007-08-15 07:49:00 UTC
Permalink
Post by Marc Perkel
The ACLs that were added to Linux were a step in the
right direction but very incomplete. What should be is
a complex permission system that would allow fine
grained permissions and inherentance masks to control
what permission are granted when someone moves new
files into a directory. Instead of just root and users
there would be mid level roles where users and objects
had management authority over parts of the system and
the roles can be defined in a very flexible way. For
example, rights might change during "business hours".
The problem with complex permission systems is, well, they are complex...

I'd still go for the UNIX KISS philosophy and the rather easy permission
system, as it is easier to manage. Windows has all that great permission
stuff, but if you look at the reality, hardly anybody uses it due to its
complexity.

Tim
Brian Wheeler
2007-08-15 18:23:12 UTC
Permalink
Hi,

While I find your ideas intriguing, I'd like to offer a friendly
suggestion: You're never going to convince anyone on LKML unless you do
the following things:
* Describe your idea in detail, including algorithms, pseudo code,
pictures, or whatever. Vague hand-wavey stuff won't do it.
* Don't accuse others of being close minded (i.e. "not thinking outside
the box"). Explain why their assertions may not be correct according to
your proposal.
* Accept that others have far more experience, no matter how
experienced you are.
* Remember that the current assumptions can't just break when the idea
is implemented: backwards compatibility is important.

Also, be prepared to have your idea shot down. Very few ideas make it
into the kernel, and those only get there after months of hashing out
the details.

Brian
Post by Marc Perkel
One of the problems with the Unix/Linux world is that your minds are
locked into this one model. In order to do it right it requires the
mental discipline to break out of that.
The major thing that you are missing is that this
"one model" has
been very heavily tested over the years. People
understand it, know
how to use it and write software for it, and grok
its limitations.
There's also a vast amount of *existing* code that
you can't just
"deprecate" overnight; the world just doesn't work
that way. The
real way to get there (IE: a new model) from here
(IE: the old model)
is the way all Linux development is done with a lot
of sensible easy-
to-understand changes and refactorings.
With that said, if you actually want to sit down and
start writing
*code* for your model, go ahead. If it turns out to
be better than
our existing model then I owe you a bottle of your
favorite beverage.
Cheers,
Kyle Moffett
When one thinks outside the box one has to think about
evolving beyond what you are used to. When I moved
beyond DOS I had to give up the idea of 8.3 file
names. The idea here is to come up with a model that
can emulate the existing system for backwards
compatibility.

The concept behind my model is to create a new layer
where you can do ANYTHING with file names and
permissions and create models that emulate Linux, DOS,
Windows, Mac, or anything else you can dream of. Then
you can create a Linux/Windows/Mac template to emulate
what you are used to.


Marc Perkel
Junk Email Filter dot com
http://www.junkemailfilter.com
Yakov Lerner
2007-08-15 20:02:05 UTC
Permalink
Post by Marc Perkel
I want to throw out some concepts about a new way of
thinking about file systems. But the first thing you
have to do is to forget what you know about file
systems now. This is a discussion about a new view of
looking a file storage that is radically different and
it's more easily undersood if you forget a lot of what
you know. The idea is to create what seems natural to
the user rather than what seems natural to the
programmer.
I believe that the kernel interface is not really meant to be operated
on the level that's directly accessible by the end user.
The food chain is a bit different. The human user interacts
with userlevel apps, not with the kernel API directly. The userlevel apps
interact in turn with the kernel APIs, either directly or via
layers of libraries.

The abstractions presented to the human user are not necessarily a 1:1
reflection of the kernel APIs. For example, you could program your novel
way of permissions as a new file manager application that
actually uses existing posix permissions underneath. Your file
manager could check with userlevel-stored policies to
implement the permissions that you describe, without any
changes in existing kernel and existing filesystems.

To expect that a posix-compliant kernel will drop its posix
compliance for a temporary experiment sounds a bit far-fetched
to me. But the door for all kinds of experiments is wide
open at the userlevel.
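
As a toy illustration of that route, a hypothetical shell wrapper
(the rm function and the ~/.rm-denylist policy file are both made up
for this sketch) that enforces an extra rule before the real rm ever
runs:

rm() {
    local f
    for f in "$@"; do
        # refuse any name listed in the per-user policy file
        if grep -qxF -- "$f" "$HOME/.rm-denylist" 2>/dev/null; then
            echo "rm: $f: denied by user policy" >&2
            return 1
        fi
    done
    command rm "$@"
}

The kernel and the filesystem never change; the policy lives entirely
in userspace.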

Yakov
Tim Tassonis
2007-08-20 11:54:13 UTC
Permalink
Hi Marc
Post by Marc Perkel
What's the point? People are openly hostile to new
ideas here. I started out nice and laid out my ideas
and you have a bunch of morons who attack anything
new.
If you think using subjects like "Thinking out of the box" (implicitly
calling everybody else narrow-minded) and "vi causes brain damage" is
starting out nice, you've also got a serious communication problem.
Post by Marc Perkel
Look at the reality of the situation. Linux is free
and yet it can't compete with operating systems that
are paid for. Maybe the reason is that when someone
points out that something is broken all you get is
justification and excuses and insults.
Funny, even Microsoft acknowledges that Linux can very well compete, as
does your beloved Novell that just recently bought Suse. Maybe you
haven't noticed yet that you're the only one left who thinks Linux
can't compete.
Post by Marc Perkel
Think about it. Why did it take 20 years for Linux to
fix the RM problem? If you type RM * you expect the
files to be gone, not some stupid error saying I'm
trying to delete too many files.
Well, it's not a stupid error, this is called a limit. Other people have
already explained to you how the UNIX shell works, so I'm not going to
repeat it here.

That said, I will even admit that I have been bitten by this limit
before (deleting a few thousand bounced mails in a spool directory).
Post by Marc Perkel
So whose fault is that? I say it's a problem with
Linux culture. If something is broken you have to
justify it instead of fixing it.
I have used Linux since the mid-90s and remember thousands and thousands
of bugs fixed and limits removed. But you must have been here longer and
have the better view of how "the Linux culture" really works.
Post by Marc Perkel
You guys are trying to make the RM problem MY FAULT
because I didn't say it nicely. Well, it doesn't have
to be said nicely. If something is broken then it
needs fixing regardless of who points it out and how.
Nobody denied the limit; it was just pointed out that you don't have a
fucking clue what the behavior actually means and where the limit lies.
And you were calling other people brain-damaged at the same time...
Post by Marc Perkel
So what does it tell you when something like this is
left broken for so long? What it tells me is that the
development process is broken.
Well, it tells _me_:
- It is a limit and not a bug
- The limit is not severe, not many people constantly have to delete
millions of files in the directory without deleting the directory itself
- The limit can be worked around by "find . | xargs \rm"

But, as you proved again and again, you're the expert.
Post by Marc Perkel
My rant on VI is to make a point. That point being
that when you use an editor that totally sucks then
it's going to cause you to write code that sucks. It's
going to lower your standards. It's going to create a
culture where poorly done work is considered
acceptable. When you use an editor as poor as vi then
the idea that rm * doesn't work becomes acceptable and
justifiable, as demonstrated here by people who
ACTUALLY DEFENDED IT.
You might have wanted to make this point. But all you really showed is
that you're an arrogant, ignorant loudmouth, talking about things you
have no clue about. I bet you haven't written a single line of decent
code in your life.

Kind regards
Tim