Discussion:
[Toybox] [PATCH] Implement mv -n / cp -n (no clobber).
Andy Chu
2016-03-19 07:40:36 UTC
Permalink
This fixes a failing test case in mv.test.

Test changes:
- Add coverage for -i (interactive).
- Better descriptions, better formatting, and removed some redundant
cases.
Rob Landley
2016-03-21 03:57:33 UTC
Permalink
Post by Andy Chu
This fixes a failing test case in mv.test.
There are actually three modes:

-n will leave the existing file.
-f will delete the existing file only if it can't write to it.
--remove-destination will delete the existing file before trying to
write to it.

Alas, for installing, you want --remove-destination, which is so very
non-posix it doesn't even have a short option. cp -f will stomp a toybox
or busybox binary through the symlink, so every command becomes bunzip2,
and -n will leave the existing file in place (i.e. not install the new one).

That's why this is still on my todo list, I wanted to have
scripts/install.sh --force do --remove-destination, but can't assume the
host's cp has it. I should just make it rm -f then cp -n in two commands...

Rob
Andy Chu
2016-03-21 07:03:30 UTC
Permalink
Post by Rob Landley
Post by Andy Chu
This fixes a failing test case in mv.test.
-n will leave the existing file.
-f will delete the existing file only if it can't write to it.
--remove-destination will delete the existing file before trying to
write to it.
Alas, for installing, you want --remove-destination, which is so very
non-posix it doesn't even have a short option. cp -f will stomp a toybox
or busybox binary through the symlink, so every command becomes bunzip2,
and -n will leave the existing file in place (i.e. not install the new one).
That's why this is still on my todo list, I wanted to have
scripts/install.sh --force do --remove-destination, but can't assume the
host's cp has it. I should just make it rm -f then cp -n in two commands...
Maybe I'm not understanding --remove-destination correctly, but does
this patch make it any harder to implement it?

Even after reading the coreutils manual, I don't understand a situation
where you would want to use --remove-destination instead of --force.
I guess my example below is not the situation it applies in, but I can't
think of another one.

$ ls -l
-rw-rw-r-- 1 andy andy 4 Mar 20 23:58 foo
-r--r--r-- 1 andy andy 5 Mar 20 23:58 hard
-r--r--r-- 1 andy andy 6 Mar 20 23:58 hard2

$ cp foo hard
cp: cannot create regular file ‘hard’: Permission denied
$ cp --force foo hard
$ cp --remove-destination foo hard2

The result seems the same either way ? The destination is copied over.

Andy
Rob Landley
2016-03-26 04:57:22 UTC
Permalink
Post by Andy Chu
Post by Rob Landley
That's why this is still on my todo list, I wanted to have
scripts/install.sh --force do --remove-destination, but can't assume the
host's cp has it. I should just make it rm -f then cp -n in two commands...
Maybe I'm not understanding --remove-destination correctly, but does
this patch make it any harder to implement it?
No, that's just the rathole of analysis I go down when I try to think
through what the correct behavior is here, and then I get distracted
by interrupt du jour. :P

Sorry, properly reading through your patch now.
Post by Andy Chu
Even after reading the coreutils manual, I don't understand a situation
where you would want to use --remove-destination instead of --force.
I guess my example below is not the situation it applies in, but I can't
think of another one.
In the linux from scratch build the bunzip2 package would cp bunzip2 /usr/bin
(as root) and if there was already a symlink there to busybox, it would
follow the symlink and overwrite the busybox binary, and brick the chroot.

(Yeah stock LFS installs itself in /tools to avoid that, but I just made a
chroot I could selectively build and install arbitrary packages in. Yes,
this is a chroot under qemu but the root filesystem aboriginal linux boots
into is a combination of tmpfs (limited space) and squashfs (read-only),
so it just copies everything into a clean directory under /home (ext3
partition with 2 gigs of free space) and chroots into it to run the build
in a context where you _can_ install packages and it should just upgrade
the system you're running.)

If bunzip2 does "cp -f bunzip2 /usr/bin" this doesn't help, because -f only
changes the behavior if it COULDN'T overwrite the file. (It tries to overwrite,
does an rm, and tries to overwrite a second time.)

If bunzip2 does "cp -n bunzip2 /usr/bin" it doesn't replace the host busybox,
so the install (silently) fails leaving the busybox version behind.

This is why I removed the writeable bit from the toybox binary, even for
root. That way, "cp -f command /usr/bin/symlink" _does_ work. It's a bit
of a hack, but multiplexer binaries are a bit special.
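
A quick illustration of why the chmod hack matters (run as a normal user
so the open actually fails; the names here are made up):

$ echo real > multiplexer; chmod a-w multiplexer
$ ln -s multiplexer bzcat
$ echo new > update
$ cp update bzcat
cp: cannot create regular file 'bzcat': Permission denied
$ cp -f update bzcat
$ cat multiplexer bzcat
real
new

Because the open failed, -f unlinks the symlink and creates a fresh file,
so the central binary survives and only the one name gets replaced.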

However, that doesn't fix bunzip2 installing over _busybox_, and I also have
the question of "how should TOYBOX install itself, given that it might be
stomping over busybox or maybe the gzip package that busybox copied the
multiplexer idea from back in the 90's?" (gunzip was a synonym for gzip -d
in the gzip 1.2.4 package the oldest versions of busybox contain, see
'Changes in applet structure' from https://busybox.net/~landley/forensics.txt
and yes that file _is_ what happens when I get angry enough).

So that's why I was thinking of adding cp --big-long-option-name to
the toybox install.sh, but I just went with "rm -f oldname && cp new oldname"
in the recent cleanup instead, because I can't trust the host's cp to have
that option _before_ toybox is installed, and we can't use the toybox
cp to install itself when cross compiling.
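
(For the record, the shape of what that ends up looking like is roughly
this -- a sketch, not the actual scripts/install.sh, and it assumes
running ./toybox with no arguments lists its commands:

rm -f "$PREFIX/toybox" && cp toybox "$PREFIX/toybox" || exit 1
for cmd in $(./toybox); do ln -sf toybox "$PREFIX/$cmd"; done

The binary gets the rm-then-cp treatment, and the per-command names can
just use ln -sf, which works fine for that.)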
Post by Andy Chu
The result seems the same either way ? The destination is copied over.
Yes, but sometimes you want it to be _replaced_, not copied over.

$ echo hello > one
$ ln -s one two
$ echo test > test
$ cp test two
$ cat one two
test
test
$ echo hello > one
$ cp -f test two
$ cat one two
test
test
$ echo hello > one
$ cp -n test two
$ cat one two
hello
hello
$ cp --remove-destination test two
$ cat one two
hello
test

Replacing one of a constellation of symlinks that didn't think to chmod -w
the central binary turns out to be fiddly otherwise. Unless you want the
rm and cp to be separate commands (which admittedly isn't any _more_ racy...
unless the "cp" command is what you're replacing, and NOW I remember
why I didn't do that earlier: I broke upgrading toybox by installing over it
on a system that's using it. Although it's actually the toybox binary that
has this problem, because ln -f works fine for replacing the individual
command names, but "rm /usr/bin/toybox; cp toybox /usr/bin/toybox" fails
with "cp not found" if cp is a symlink to toybox on the host...)

Sigh...

Rob
Andy Chu
2016-03-26 05:57:29 UTC
Permalink
Post by Rob Landley
So that's why I was thinking of adding cp --big-long-option-name to
the toybox install.sh, but I just went with "rm -f oldname && cp new oldname"
in the recent cleanup instead, because I can't trust the host's cp to have
that option _before_ toybox is installed, and we can't use the toybox
cp to install itself when cross compiling.
OK thanks for the test case -- I ran it and it helps me understand
what --remove-destination does. Although I'm still wondering what is
wrong with rm && cp? As you say yourself, having it in one command
doesn't remove any race conditions.

I guess you're saying you can get by with -f alone with toybox,
because you can control whether it is writable. But that doesn't
solve the problem with busybox or other multiplexer binaries which you
don't control? Did I paraphrase right?

Either way I still don't see what's wrong with rm && cp. I thought
Aboriginal Linux was supposed to be the minimal set of things needed
to build itself... and if toybox followed that philosophy then it
would also leave out what can be accomplished by sequential
composition in the shell :)

Not that I am really arguing against adding --remove-destination --
just curious.

Honestly this entire discussion is reminding me of a Unix deficiency I've
noticed. For background, in the "cloud" world (as opposed to the
embedded world), people tend to set up their base image with something
like Chef, Puppet, or Ansible, which are basically horrible
Ruby/Python DSLs with embedded shell snippets (and of course nobody
knows how to quote correctly when shell is embedded in yet another
language...).

My reaction is: why don't you just use shell scripts to configure your
servers? (And others have had the same thought, e.g.
https://github.com/brandonhilkert/fucking_shell_scripts , although
ironically it depends on Ruby ...)

Well the one good argument is that those systems are supposed to be
idempotent, whereas shell is not idempotent. To be idempotent, you
would basically describe a final state, without regard to the existing
state -- which is not really possible with shell.

In distributed systems, safe retries are essential. Perhaps a more
immediate example is that you would want to be able to Ctrl-C your
shell script at an *arbitrary* point in time and have it work
correctly the second time (without resetting the state back to what it
was the first time.)

Examples:

# mkdir can't be run twice; it fails the second time because the dir
# exists. mkdir -p oddly conflates the behavior of ignoring existing
# dirs with creating intermediate dirs
$ mkdir dir

# likewise rm can't be run twice; the second time it will fail because
# the file doesn't exist. --force conflates the behavior of ignoring
# missing arguments with not prompting for non-writable files
$ rm foo

# behavior depends on whether bar is an existing directory; -T /
# --no-target-directory fixes this I believe
$ cp foo bar
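
To sketch what I mean by idempotent variants (function names invented
here, and the mkdir one still has the inherent create-vs-check race):

ensure_dir()  { [ -d "$1" ] || mkdir "$1"; }   # ok if it already exists, no -p
ensure_gone() { rm -f -- "$1"; }               # ok if it was never there
ensure_copy() { cp -T -- "$1" "$2"; }          # treat $2 as the file, never a target dir

Each one describes the final state you want, so running the script a
second time (or after a Ctrl-C) doesn't change the outcome.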

Anecdotally, it seems like a lot of shell script issues are caused by
unexpected existing state, but in a lot of cases you don't CARE about
the existing state -- you just want a final state (e.g. a bunch of
symlinks to toybox). That seems to be a common thread in a lot of
situations you're describing if I'm not mistaken.

So if Unix tools had flags that made them behave in an idempotent
manner, there would be less objection to using them for cloud server
management. They would be more "declarative" and less imperative.

Anyway, that's a bit of a tangent... (One reason I'm interested in
toybox is that I've had a longstanding plan to write my own shell and
toybox/busybox are obvious complements to a shell. Though it's
interesting that busybox has two shells and toybox has zero, I think
my design space is a little different in that I want it to be sh/bash
compatible but also have significant new functionality.)

Andy
Rob Landley
2016-03-26 07:00:01 UTC
Permalink
Post by Andy Chu
Post by Rob Landley
So that's why I was thinking of adding cp --big-long-option-name to
the toybox install.sh, but I just went with "rm -f oldname && cp new oldname"
in the recent cleanup instead, because I can't trust the host's cp to have
that option _before_ toybox is installed, and we can't use the toybox
cp to install itself when cross compiling.
OK thanks for the test case -- I ran it and it helps me understand
what --remove-destination does. Although I'm still wondering what is
wrong with rm && cp? As you say yourself, having it in one command
doesn't remove any race conditions.
I remembered it as I was typing, and it was the next paragraph of the
previous email. On a system where cp is currently a symlink to toybox,
and you're installing a new toybox over that:

rm /usr/bin/toybox
cp toybox /usr/bin/toybox #fails because cp is dangling symlink

It's not a race condition, it's a "you can keep running a binary after
it's deleted but can't launch new instances" problem.
Post by Andy Chu
I guess you're saying you can get by with -f alone with toybox,
because you can control whether it is writable. But that doesn't
solve the problem with busybox or other multiplexer binaries which you
don't control? Did I paraphrase right?
More or less, yes.
Post by Andy Chu
Either way I still don't see what's wrong with rm && cp. I thought
Aboriginal Linux was supposed to be the minimal set of things needed
to build itself... and if toybox followed that philosophy then it
would also leave out what can be accomplished by sequential
composition in the shell :)
Aboriginal Linux lets you build linux from scratch under it. I added a
number of commands that aren't needed to bootstrap aboriginal, but _are_
necessary to bootstrap LFS.

Thus _it_ can get away with not having lex because you can build and
install lex natively after the fact. You can piecemeal supplement what I
provide, without replacing the existing stuff if you don't want to.

With toybox, it's an all-or-nothing thing. If you really need some
option toybox doesn't provide _within_ a command, you basically have to
install another implementation and stop using the toybox implementation
of that command. You can't supplement with more options after the fact.

This doesn't mean I implement every option, but it does mean I have to
consider them and decide whether or not to.
Post by Andy Chu
Not that I am really arguing against adding --remove-destination --
just curious.
Honestly, the main reason I haven't added it so far is it doesn't have a
short option. It's only a line or two of code to implement, but the
command line interface is ugly, and I'm not sure how much it's really
used out there. It _does_ solve some specific problems you can't solve
without it, but they're not very common problems.

Until recently I had CP_MORE so you could configure a cp with only the
posix options, but one of the philosophical differences I've developed
since leaving busybox is that all that extra configuration granularity is
nuts. The space savings aren't worth the fact that you no longer get
consistent debugging coverage (you may have configurations that don't
BUILD, although switching from #ifdef to if (CFG_BLAH) cuts that down a
lot), that there's increased cognitive load on the package's users to
_configure_ the thing (deciding what commands to include is hard enough),
and that once you've got a deployed system the package provides
inconsistent behavior depending on how it's configured, so you can't say
"this is how toybox behaves".

So I threw out CP_MORE as a bad idea, and almost all commands just have
the "include it or not" option now. There are a few global options, but
not many, and I may even eliminate some of those (I18N: the world has
utf8 now, deal with it).
Post by Andy Chu
Honestly this entire discussion is reminding me of a Unix deficiency I've
noticed. For background, in the "cloud" world (as opposed to the
embedded world), people tend to set up their base image with something
like Chef, Puppet, or Ansible, which are basically horrible
Ruby/Python DSLs with embedded shell snippets (and of course nobody
knows how to quote correctly when shell is embedded in yet another
language...).
My reaction is: why don't you just use shell scripts to configure your
servers? (And others have had the same thought, e.g.
https://github.com/brandonhilkert/fucking_shell_scripts , although
ironically it depends on Ruby ...)
Toybox has no external dependencies. You build a static binary, drop it
on the machine, and it should work. Any config files for stuff like mdev
or dhcp are always _optional_, and there should be sane default behavior
when they're not there.

That's an explicit design goal.
Post by Andy Chu
Well the one good argument is that those systems are supposed to be
idempotent, whereas shell is not idempotent. To be idempotent, you
would basically describe a final state, without regard to the existing
state -- which is not really possible with shell.
Aboriginal Linux is idempotent (modulo the /home mount
dev-environment.sh provides, which is intentionally persistent scratch
space), and it's driven by shell scripts. How? Simple: the root
filesystem is an initmpfs initialized from a read-only gzipped cpio, and
the native compiler is a squashfs filesystem (also read only). If you
add a build control image, that's another squashfs mounted on /mnt.

The lfs-bootstrap.hdc build image (which I'm 2/3 done updating to 7.8, I
really need to get back to that) does a "find / -xdev | cpio" trick to
copy the root filesystem into a subdirectory under /home and then chroot
into that, so your builds are internally persistent but run in a
disposable environment.
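
The trick itself is only a few lines; roughly (paths and the build
script name are illustrative):

$ mkdir -p /home/build.1
$ cd / && find . -xdev | cpio -pdmu /home/build.1
$ chroot /home/build.1 /bin/sh /mnt/lfs-build.sh

-xdev keeps find on the root filesystem, and cpio -p ("pass-through"
mode) recreates the tree under the new directory, so the chroot gets a
writeable copy of the running system without touching the original.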

All this predates vanilla containers, I should probably add some
namespace stuff to it but haven't gotten around to it yet...
Post by Andy Chu
In distributed systems, safe retries are essential. Perhaps a more
immediate example is that you would want to be able to Ctrl-C your
shell script at an *arbitrary* point in time and have it work
correctly the second time (without resetting the state back to what it
was the first time.)
# mkdir can't be run twice; it fails the second time because the dir
exists. mkdir -p oddly conflates the behavior of ignoring existing
dirs with creating intermediate dirs
A) Elaborate on "oddly conflates" please? I saw it as 'ensure this path
is there'.

B) [ ! -d "$DIR" ] && mkdir "$DIR"
Post by Andy Chu
$ mkdir dir
# likewise rm can't be run twice; the second time it will fail because
the file doesn't exist. --force conflates the behavior of ignoring
missing arguments with not prompting for non-writable files
-f means "make sure this file is not there".

And you're not writing to the file's contents, you're unlinking it from
this directory. There could be twelve other hardlinks to the same inode
out there, rm doesn't care.
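
For instance (a throwaway demo):

$ echo data > file
$ ln file link          # second name for the same inode
$ rm file               # removes one name; the inode and its data survive
$ cat link
data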

I admit "zero arguments are ok" is a really WEIRD thing posix asked for,
dunno why they did that, but I implemented it. (You can tell I didn't
write tests/rm.test because it hasn't got a test for "rm -f" with no
arguments. It should. That should totally be "tests/rm.pending"...)
Post by Andy Chu
$ rm foo
# behavior depends on whether bar is an existing directory, -T /
--no-target-directory fixes this I believe
$ cp foo bar
I do a lot of "cp/mv/rsync fromdir/. todir/." just to make this sort of
behavior shut up, but it's what posix said to do.
Post by Andy Chu
Anecdotally, it seems like a lot of shell script issues are caused by
unexpected existing state, but in a lot of cases you don't CARE about
the existing state -- you just want a final state (e.g. a bunch of
symlinks to toybox).
"ln -sf" actually works pretty predictably. :)
Post by Andy Chu
That seems to be a common thread in a lot of
situations you're describing if I'm not mistaken.
Yes and no. I've seen a lot of people try to "fix" unix and go off into
the weeds of MacOS X or GoboLinux. Any time a course of action can be
refuted by an XKCD strip, I try to pay attention. In this case:

https://xkcd.com/927/

Unix has survived almost half a century now for a _reason_. A corollary
to Moore's Law I noticed years ago is that 50% of what you know is
obsolete every 18 months. The great thing about unix is it's mostly the
same 50% cycling out over and over.

Sure, it's crotchety and idiosyncratic. So's the qwerty keyboard.
Post by Andy Chu
So if Unix tools had flags that made them behave in an idempotent
manner, there would be less objection to using them for cloud server
management. They would be more "declarative" and less imperative.
I've come to despise declarative languages. In college I took a language
survey course that covered prolog, and the first prolog program I wrote
locked the prolog interpreter into a CPU-eating loop for an hour, in
about 5 lines. The professor looked at it for a bit, and then basically
said to write a prolog program that DIDN'T do that, I had to understand
how the prolog interpreter was implemented. And this has pretty much
been my experience with declarative languages ever since, ESPECIALLY make.

I wound up in embedded programming because I break everything. I broke
the command line tools, I broke libc, I broke the kernel, broke the
compiler and linker, and I debugged my way down through all of it. If
you have a one instruction race, I'll hit it. No really:

http://lkml.iu.edu/hypermail/linux/kernel/0407.3/0027.html

I do this kind of thing ALL THE TIME. I have a black belt in "sticking
printfs into things" because I BREAK DEBUGGERS. (I'm quite fond of
strace, though, largely because it's survived everything I've thrown at
it and is basically sticking a printf into the syscall entry for me so I
don't have to run the code under User Mode Linux anymore, where yes I
literally did that.)

Last month I posted links here to me debugging my way into the kernel
and back out again to find a bug in an entirely different process that
had already stopped running which was affecting my build. The only thing
unusual about that instance is I did a longish writeup of it.

So when a declarative language offers to make life easier by automating
away all the complexity and doing it for me? Yeah... I'll let other
people who aren't me get right on that.
Post by Andy Chu
Anyway, that's a bit of a tangent... (One reason I'm interested in
toybox is that I've had a longstanding plan to write my own shell
Heh. Me too. I spent rather a lot of 2006 and 2007 working out why you
can't ctrl-z out of the bash "read" builtin, for example, or that when
sourcing a file you _can_ pipe the output somewhere but if you ctrl-z
during that it aborts the "source" and the resume resumes the host
shell. (Maybe they fixed it?)

Oh, and when $(commands) produce NUL bytes in the output, different
shells do different things with them. (Bash edits them out but retains
the data afterwards.)
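
Quick way to poke at it, though the exact behavior (and whether you get
a warning) varies by shell and version:

$ x=$(printf 'a\0b'); echo "${#x}:$x"
2:ab

Bash drops the NUL but keeps the byte after it, which is the "edits them
out but retains the data afterwards" behavior.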

I was apparently pretty deep into this stuff in mid-2006:

http://landley.net/notes-2006.html#14-09-2006

But by december I was just _angry_ about anything fsf-related, which
seriously shortened my patience when prodding at bash to see what it did:

http://landley.net/notes-2006#25-12-2006

Of course between those, there was a bad case of Bruce:

http://lwn.net/Articles/202106/
http://lwn.net/Articles/202120/

Anyway, I'm trying to get back to the shell now but there are so many
other things that keep interrupting... :)
Post by Andy Chu
and
toybox/busybox are obvious complements to a shell. Though it's
interesting that busybox has two shells and toybox has zero, I think
my design space is a little different in that I want it to be sh/bash
compatible but also have significant new functionality.)
Other than "loop", what are you missing?

(For YEARS I've wanted to pipe the output of a command line into the
input of that same command line. Turns out to be hard, and no you can't
do it as an elf binary because "loop thingy | thingy" would have to have
everything after loop quoted and spawn another shell instance via
/bin/sh and it's just annoying. I've made it work with FIFOs but that
requires writeable space (where?) and cleaning up after a badly timed
ctrl-c is just awkward and it turns into this BIG THING...
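
The FIFO version is roughly this -- thingy is a stand-in, and you still
need somewhere writeable to put the FIFO, which is the "where?" problem:

$ mkfifo "$TMPDIR/loop"
$ thingy < "$TMPDIR/loop" | thingy > "$TMPDIR/loop"
$ rm "$TMPDIR/loop"

The two opens block until both ends of the FIFO show up, which works
here because the pipeline starts both processes at once.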

(The first time I wanted it was while trying to make PPPOE work, using a
binary that expected to be run as a child of a process I didn't have and
was trying to fake with a long command line. The second time was when
doing a shell implementation of expect for a test suite, because I'm not
installing tcl, if expect is the only thing keeping an entire
programming language alive it should die already.)
Post by Andy Chu
Andy
Rob
Andy Chu
2016-03-27 18:25:25 UTC
Permalink
Post by Rob Landley
rm /usr/bin/toybox
cp toybox /usr/bin/toybox #fails because cp is dangling symlink
It's not a race condition, it's a "you can keep running a binary after
it's deleted but can't launch new instances" problem.
OK I see now. Why not just:

mv /usr/bin/toybox /tmp/cp
/tmp/cp toybox /usr/bin/toybox

If that's really the only reason to use --remove-destination vs rm &&
cp, then it does seem superfluous (typical GNU bloat like cat -v).
Post by Rob Landley
Until recently I had CP_MORE so you could configure a cp with only the
posix options, but one of the philosophical differences I've developed
since leaving busybox is all that extra configuration granularity is
nuts.
FWIW I agree -- configuring support for flags *within* a command is
very fine grained and I doubt most people would use it.
Post by Rob Landley
So I threw out CP_MORE as a bad idea, and almost all commands just have
the "include it or not" option now. There are a few global options, but
not many, and I may even eliminate some of those (I18N: the world has
utf8 now, deal with it).
I agree utf-8 is the right choice... The expr.c code from mksh has a
bunch of multibyte character support at the end, which makes you
appreciate the simplicity of utf-8:

https://github.com/MirBSD/mksh/blob/master/expr.c

bash seems to talk with some regret over support for multibyte
characters: http://aosabook.org/en/bash.html
Post by Rob Landley
The lfs-bootstrap.hdc build image (which I'm 2/3 done updating to 7.8, I
really need to get back to that) does a "find / -xdev | cpio" trick to
copy the root filesystem into a subdirectory under /home and then chroot
into that, so your builds are internally persistent but run in a
disposable environment.
All this predates vanilla containers, I should probably add some
namespace stuff to it but haven't gotten around to it yet...
I'll have to look at Aboriginal again... but for builds, don't you
just need chroots rather than full-fledged containers? (i.e. you
don't really care about network namespaces, etc.)

Oh one interesting thing I just found out is that you can use user
namespaces to fake root (compare with Debian's LD_PRELOAD fakeroot
solution)
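
For example, with a reasonably recent util-linux (output illustrative):

$ unshare -r sh -c 'id -u; touch owned-by-root; ls -l owned-by-root'
0
-rw-r--r-- 1 root root 0 Mar 27 00:00 owned-by-root
$ ls -l owned-by-root
-rw-r--r-- 1 andy andy 0 Mar 27 00:00 owned-by-root

Inside the namespace your uid is mapped to 0, so tools that just check
for uid 0 can be satisfied without actually being root.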

Last year, I was using this setuid root executable
(https://git.gnome.org/browse/linux-user-chroot/commit/), which is a
nice primitive for reproducible builds (i.e. not running lots of stuff
as root just because you need to chroot).

And I see in their README they are pointing to a Bazel (google build
system) tool that has an option to fake root with user namespaces.
Although I'm not sure you want to make that executable setuid root.
Post by Rob Landley
A) Elaborate on "oddly conflates" please? I saw it as 'ensure this path
is there'.
B) [ ! -d "$DIR" ] && mkdir "$DIR"
It says this right in the help:

-p, --parents no error if existing, make parent directories as needed

I guess you can think of the two things as related, but it's easy to
imagine situations where you only want to create a direct descendant
and it's OK if it exists.

B) has a race condition whereas checking errno doesn't, and mkdir $DIR
|| true has the problem that it would ignore other errors.
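
One way to get "direct descendant only, and ok if it already exists"
without the check-then-create race is to test only after mkdir fails:

mkdir "$DIR" 2>/dev/null || [ -d "$DIR" ] || exit 1

If the directory already existed (or someone else just created it) this
succeeds; a missing parent, a permission problem, or $DIR existing as a
plain file still fails.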
Post by Rob Landley
Post by Andy Chu
# likewise rm can't be run twice; the second time it will fail because
the file doesn't exist. --force conflates the behavior of ignoring
missing arguments with not prompting for non-writable files
-f means "make sure this file is not there".
The help also describes the two different things it does:

-f, --force ignore nonexistent files and arguments, never prompt

The first behavior makes it idempotent... the second disables the
prompt when removing write-protected files, which is unrelated to
idempotency (and yes I get that you're modifying the directory and not
the file, but that's the behavior rm already has)
Post by Rob Landley
Post by Andy Chu
# behavior depends on whether bar is an existing directory, -T /
--no-target-directory fixes this I believe
$ cp foo bar
I do a lot of "cp/mv/rsync fromdir/. todir/." just to make this sort of
behavior shut up, but it's what posix said to do.
What does this do? It doesn't seem to do quite what -T does:

$ ls
bar foo # empty dirs
$ mv foo/. bar/.
mv: cannot move ‘foo/.’ to ‘bar/./.’: Device or resource busy
$ mv -T foo bar # now foo is moved over the empty dir bar
Post by Rob Landley
Yes and no. I've seen a lot of people try to "fix" unix and go off into
the weeds of MacOS X or GoboLinux. Any time a course of action can be
https://xkcd.com/927/
Unix has survived almost half a century now for a _reason_. A corollary
to Moore's Law I noticed years ago is that 50% of what you know is
obsolete every 18 months. The great thing about unix is it's mostly the
same 50% cycling out over and over.
Definitely agreed -- but that's why I'm not creating an alternative,
but starting with existing behavior and adding to it. That's one of
the reasons I am interested in toybox... to puzzle through all the
details of existing practice and standards where relevant, to make
sure I'm not inventing something worse :)

The motivation for the idempotency is a long story... but suffice to say
that people are not really using Unix itself for distributed systems.
They are building non-composable abstractions ON TOP of Unix as the
node OS (new languages and data formats -- Chef/Puppet being an
example; Hadoop/HDFS; and tons of Google internal stuff). AWS is a
distributed operating system; Google has a few distributed operating
systems as well. It's still the early days and I think they are
missing some lessons from Unix.

Sure I could just go change coreutils and bash ... I've been puzzling
through the bash source code and considering that.

If one of your goals is to support Debian, I think you should be
really *happy* that they went through all the trouble of porting their
shell scripts to dash, because that means all the shell scripts use
some common subset of bash and dash. Portability means that the
scripts don't go into all the dark corners of each particular
implementation.

bash is like 175K+ lines of code, and if you wanted to support all of
it, I think you would end up with at least 50K LOC in the shell...
which is almost the size of everything in toybox to date. If on the
other hand you want a reasonable and compatible shell, rather than an
"extremely compatible" shell, it would probably be a lot less code...
hopefully less than 20K LOC (busybox ash is 13K LOC IIRC, but it's
probably too bare)
Post by Rob Landley
I've come to despise declarative languages. In college I took a language
survey course that covered prolog, and the first prolog program I wrote
locked the prolog interpreter into a CPU-eating loop for an hour, in
about 5 lines. The professor looked at it for a bit, and then basically
said to write a prolog program that DIDN'T do that, I had to understand
how the prolog interpreter was implemented. And this has pretty much
been my experience with declarative languages ever since, ESPECIALLY make.
This is a long conversation, but I think you need an "escape hatch"
for declarative languages, and make has one -- the shell. If you don't
have an escape hatch, you end up with tortured programs that work
around the straitjacket of the declarative language. (But this is
not really related to what I was suggesting with idempotency; this is
more of a semantic overload of "declarative".)

Unfortunately GNU make's solution was not to rely on the escape hatch
of the shell, but to implement a tortured shell within Make (it has
looping, conditionals, functions, variables, string library functions,
etc. -- an entirely separate Turing complete language)

Make's abstraction of lazy computation is useful (although it needs to
be updated to support directory trees and stuff like that). But most
people are breaking the model and using it for "actions" -- as
mentioned, the arguments to make should be *data* on the file system,
and not actions; otherwise you're using it for the wrong job and
semantics are confused (e.g. .PHONY pretty much tells you it's a hack)
Post by Rob Landley
I do this kind of thing ALL THE TIME. I have a black belt in "sticking
printfs into things" because I BREAK DEBUGGERS. (I'm quite fond of
strace, though, largely because it's survived everything I've thrown at
it and is basically sticking a printf into the syscall entry for me so I
don't have to run the code under User Mode Linux anymore, where yes I
literally did that.)
I think the problem is that you expect things to actually work! :) A
lot of programmers have high expectations of software; users generally
have low expectations.

http://blog.regehr.org/archives/861 -- "How have software bugs trained
us? The core lesson that most of us have learned is to stay in the
well-tested regime and stay out of corner cases. Specifically, we will
... "

Another hacker who has the same experience:
http://zedshaw.com/2015/07/08/i-can-kill-any-computer/

I was definitely like this until I learned to stop changing defaults.
Nobody tests anything but the default configuration. Want to switch
window managers in Ubuntu? Nope, I got subtle drawing bugs related to
my video card. As penance for my lowered expectations, I try to work
on quality software...
Post by Rob Landley
Oh, and when $(commands) produce NUL bytes in the output, different
shells do different things with them. (Bash edits them out but retains
the data afterwards.)
Yeah hence my warning about trying to be too compatible with bash ...
Reading the aosabook bash article and referring to the source code
opened my eyes a lot. sh does have a POSIX grammar, but it's not
entirely useful, as he points out, and I see what he means when he
says that using yacc was a mistake (top-down parsing fits the shell
more than bottom-up).

On the other hand, writing a shell parser and lexer by hand is a
nightmare too (at least if you care about bugs, which most people seem
not to). I'm experimenting with using 're2c' for my shell lexer,
which seems promising.

Reading the commit logs of bash is interesting... all of its features
seem to be highly coupled. There are lots of lists like this where
one feature is compared against lots of other features:
http://git.savannah.gnu.org/cgit/bash.git/tree/RBASH . The test
matrix would be insane.
Post by Rob Landley
Post by Andy Chu
and
toybox/busybox are obvious complements to a shell. Though it's
interesting that busybox has two shells and toybox has zero, I think
my design space is a little different in that I want it to be sh/bash
compatible but also have significant new functionality.)
Other than "loop", what are you missing?
At a high level, I would say:

1) People keep saying to avoid shell scripts for serious "software
engineering" and distributed systems. I know a lot of the corner
cases and a lot of people don't, so that could be a defensible
position. You can imagine a shell and set of tools that were a lot
more robust (e.g. pedantically correct quoting is hard and looks ugly,
but also more than that)

2) Related: being able to teach shell to novices with a straight face.
Shell really could be an ideal first computing language, and it was
for many years. Python or even JavaScript is more favored now
(probably rightly). But honestly shell has an advantage in that to
*DO* anything, you need to talk to a specific operating system, and
Python and JavaScript have this barrier of portability. But the bar
has been raised in terms of usability -- e.g. memorizing all these
single letter flag names is not really something people are up to.

3) Security features for distributed systems ... sh is obviously not
designed for untrusted input (including what's on the file system).

I could get into a lot of details but I guess my first task is to come
up with something "reasonably" compatible with sh/bash, but with a
code structure that's extensible.

FWIW toybox code is definitely way cleaner than bash, though I
wouldn't necessarily call it extensible. You seem to figure out the
exact set of semantics you want, and then find some *global* minimum
in terms of implementation complexity, which may make it harder to add
big features in the future (you would have to explode and compress
everything again). I suppose that is why this silly -n patch requires
recalling everything else about cp/mv, like --remove-destination :)
But I definitely learned something from this style even though I'm not
sure I would use it for most projects!

Andy
Andy Chu
2016-03-27 18:46:26 UTC
Permalink
Post by Andy Chu
3) Security features for distributed systems ... sh is obviously not
designed for untrusted input (including what's on the file system).
4) ... along these same lines, a lot of other people are annoyed by
the gratuitous and crippled shell in make:

https://github.com/apenwarr/redo -- an implementation of a DJB design
(each rule is a shell file, so you automatically get the property that
changing the command line/rule invalidates the output, which Make
doesn't have)

https://github.com/danfuzz/blur -- from the author of the Android.mk
system (which slurps a big tree of Make fragments into a single
namespaced Makefile). I guess he got sick of Makefiles after that ...

Both of these have issues but are interesting... I think it wouldn't
be too hard to make a shell in which you could IMPLEMENT make. It
seems like you've gone a little in that direction with your "isnewer"
checks in scripts/make.sh!
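
(Something in that spirit, though not the actual scripts/make.sh code,
is just a timestamp test around each action:

isnewer() { [ ! -e "$2" ] || [ "$1" -nt "$2" ]; }
isnewer main.c main.o && cc -c main.c -o main.o

-nt isn't strictly POSIX but the common shells have it, and once you
have that, the "rebuild only when out of date" part of make is a few
lines of sh.)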

The problem is that all this stuff is at the foundational layer, and
being backward compatible with both shell AND make is a very tall
order (although I guess that's where my interests overlap with toybox)

Andy
Rob Landley
2016-03-28 06:53:36 UTC
Permalink
Post by Andy Chu
Post by Andy Chu
3) Security features for distributed systems ... sh is obviously not
designed for untrusted input (including what's on the file system).
4) ... along these same lines, a lot of other people are annoyed by
https://github.com/apenwarr/redo -- implementation of a DJB design,
each rule is a shell file, so you automatically get the property that
changing the command line/rule invalidates the output, which Make
doesn't have)
https://github.com/danfuzz/blur -- from the author of the Android.mk
system (which slurps a big tree Make fragments into a single
namespaced Makefile). I guess he got sick of Makefiles after that ...
Both of these have issues but are interesting... I think it wouldn't
be too hard to make a shell in which you could IMPLEMENT make. It
seems like you've gone a little in that direction with your "isnewer"
checks in scripts/make.sh!
Writing a new _make_ is very interesting to me. That desperately needs
to be replaced, as I explained in the "make rant" URLs I linked to last
message:


http://lists.landley.net/pipermail/aboriginal-landley.net/2011-June/000859.html


http://lists.landley.net/pipermail/aboriginal-landley.net/2011-June/000860.html

However, Elliott told me Android is in the process of doing that:

https://twitter.com/landley/status/691668319481364480

I really need to look into that build system. It's on the todo list. :)

(But that won't get me out of implementing a make that can build Linux
From Scratch and Buildroot and so on. If that has gnu extensions, then
it has gnu extensions. If it needs a built-in shell, it can use toysh.)
Post by Andy Chu
The problem is that all this stuff is at the foundational layer, and
being backward compatible with both shell AND make is a very tall
order (although I guess that's where my interests overlap with toybox)
I make it compatible with real world test data. You just have to collect
a large enough corpus of real world test data, plus whatever "standards"
you feel are worth looking at (general posix and man pages), and then
have at and do a lot of debugging as people use it and it breaks for them.

At a certain point using your users as guinea pigs is unavoidable. The
goal is to minimize it so you're not being _impolite_ to them by having
them find problems you could have found yourself.
Post by Andy Chu
Andy
Rob
dmccunney
2016-03-27 20:36:39 UTC
Permalink
Post by Andy Chu
Sure I could just go change coreutils and bash ... I've been puzzling
through the bash source code and considering that.
<gack>
Post by Andy Chu
bash is like 175K+ lines of code, and if you wanted to support all of
it, I think you would end up with at least 50K LOC in the shell...
which is almost the size of everything in toybox to date. If on the
other hand you want a reasonable and compatible shell, rather than an
"extremely compatible" shell, it would probably be a lot less code...
hopefully less than 20K LOC (busybox ash is 13K LOC IIRC, but it's
probably too bare)
I started on Unix System V R2, where the default shell installed as
/bin/sh was the Bourne shell *before* it acquired shell functions. I
used csh for a while as a better interactive environment, but didn't
try to write scripts in it. Then I discovered the Korn shell, and as
soon as it reached the point where it could successfully be installed
as /bin/sh, I did so on any machine I administered.

I had a Unix machine before I got an MSDOS PC, and spent time trying
to make the PC look as much like Unix as possible. The big win was
finding the MKS Toolkit, which had DOS versions of every Unix command
that made sense on a single-user, single tasking OS. The Toolkit
included a complete Korn shell replica that did everything save
asynchronous background tasks. When I was in the Korn shell
environment, you had to dig to discover you *weren't* on a Unix box.
:-)

When I encountered bash, I thought "Oh, wonderful. The FSF has
decided to make bash include *everything* from *all* other shells.
The result is a bloated mess."

I have it under Linux, and Windows courtesy of a git implementation,
and Android.

I don't know what Rob's intent is when he can finally get to toysh,
but I'd be delighted with something that looked and acted like ksh. I
see no point to re-implementing the wheel by making a bash clone. If
the Android user really needs bash, they can install a third party
port.

I have busybox on a quirky Linux system, and went through and replaced
the busybox commands with full versions from Ubuntu, because I wanted
full versions and not cut down subsets. Thus far, Rob seems to be
implementing toybox commands that *are* full versions supporting all
of the features of the stand alone versions, but still dramatically
reducing code size and rationalizing the design, so when Toybox hits
an actual 1.0 release, I won't feel a need to substitute stand alone
versions.

As far as I can tell, it will be possible to have a working Linux CLI
system with no GPL code save the kernel. I am *all* in favor.
______
Dennis
https://plus.google.com/u/0/105128793974319004519
Rob Landley
2016-03-28 06:43:43 UTC
Permalink
Post by Andy Chu
Post by Rob Landley
So I threw out CP_MORE as a bad idea, and almost all commands just have
the "include it or not" option now. There are a few global options, but
not many, and I may even eliminate some of those (I18N: the world has
utf8 now, deal with it).
I agree utf-8 is the right choice... The expr.c code from mksh has a
bunch of multibyte character support at the end, which makes you
https://github.com/MirBSD/mksh/blob/master/expr.c
If MirBSD had a public domain equivalent license I'd just suck the whole
thing in _now_ as a starting point for toysh, but alas it's using one of
the many, many conflicting "copy this license text verbatim into derived
works" clauses that leads to dozens of concatenated copies of the same
license text. (The "kindle paperwhite" had over 300 pages of this in
about->licenses. In comparison, toolbox's ~30 copies of
https://github.com/android/platform_system_core/blob/master/toolbox/NOTICE
is outright _restrained_. Heck, buildroot created _infrastructure_
("make legal-info") to concatenate all those license files together for
you.)

Not opening that can of worms. Sorry, but no.
Post by Andy Chu
bash seems to talk with some regret over support for multibyte
characters: http://aosabook.org/en/bash.html
Eh, do it right from the start and it's not that hard.

Did I mention I broke bash's command history with invalid utf8 sequences
so that cursor up/down advanced the cursor a couple spaces each time due
to the redraw not calculating the font metrics consistently? Fun times...

Yes, I have executable names (well, symlinks to sleep) that are invalid
utf8 sequences. Because top needed to display them. I also have a
command with a ) and a newline in the middle of it to confirm we're
doing the /proc/$$/stat field 2 parsing properly, _and_ that the result
doesn't corrupt the display. I may not have worked out automated
regression testing for this stuff yet, but I test it during development
and write down what the tests _were_...
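
(If anyone wants to reproduce that kind of name, bash and coreutils
printf understand \xHH, so something like this works:

$ ln -s "$(command -v sleep)" "$(printf 'sl\xffeep')"
$ ./"$(printf 'sl\xffeep')" 300 &

then look at what top and ps make of the result.)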

And yes, those symlinks to sleep led to the todo item that if your
argv[0] is a symlink with a name we don't understand, we should resolve
the symlink to see if it's a name we _do_ understand, and repeat until
we've reached a non-symlink or done 99 resolves and assumed it's a loop.
I haven't decided whether or not to actually implement it yet, but it's
on the todo list.

Haven't decided yet because the scripts/single.sh builds don't have this
problem, but the fact they don't suck in the multiplexer logic also
means I can beef up the multiplexer infrastructure a bit without
worrying about "true" getting bigger when built standalone. (Fun trick:
make defconfig, make baseline, make menuconfig and switch off a command,
make bloatcheck, and that should show you what that command added to the
build. Kinda verbose to stick in the 'make help' output but I hope it's
obvious. I really need to add a FAQ page...)
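
(Spelled out, the sequence I mean is:

$ make defconfig
$ make baseline        # build and stash a reference binary
$ make menuconfig      # switch one command off
$ make bloatcheck      # rebuild and show the size delta vs the baseline

with the usual caveat that the comments are my gloss on it, not the
Makefile's own documentation.)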
Post by Andy Chu
Post by Rob Landley
The lfs-bootstrap.hdc build image (which I'm 2/3 done updating to 7.8, I
really need to get back to that) does a "find / -xdev | cpio" trick to
copy the root filesystem into a subdirectory under /home and then chroot
into that, so your builds are internally persistent but run in a
disposable environment.
All this predates vanilla containers, I should probably add some
namespace stuff to it but haven't gotten around to it yet...
I'll have to look at Aboriginal again...
I tried hard to make http://landley.net/aboriginal/about.html explain
everything, and before that I did a big long presentation at
https://speakerdeck.com/landley/developing-for-non-x86-targets-using-qemu trying
to explain everything. (There were a couple earlier attempts to explain
everything but they bit-rotted badly over the years.)
Post by Andy Chu
but for builds, don't you
just need chroots rather than full fledged containers? (i.e. you
don't really care about network namespaces, etc.)
A) requires root access on the host.

B) testing ifconfig will still screw up your system to the point of
requiring a reboot.

C) Yes, I created infrastructure that created custom chroots to test
stuff, way back in the busybox days (I linked to it a few posts back,
and it's still in
https://github.com/landley/toybox/blob/master/scripts/runtest.sh#L115 )
and in _theory_ I can use
https://github.com/landley/aboriginal/blob/master/sources/root-filesystem/sbin/zapchroot
to clean up after it when testing "mount" (I really want to add -R to
toybox umount to recursively unmount all the mount points under this
directory, it's on the todo list), but the problem at the time was if I
wanted to use a squashfs image to test mount how did I know if squashfs
was in the kernel? And some annoying distro (probably fedora) refused to
put sbin in the path for normal users so mke2fs wasn't there when doing
a single mount test...

Really, when testing the root stuff you need a known environment. It's
really, really brittle otherwise.
Post by Andy Chu
Oh one interesting thing I just found out is that you can use user
namespaces to fake root (compare with Debian's LD_PRELOAD fakeroot
solution)
Yes, but:

http://lists.linuxfoundation.org/pipermail/containers/2016-March/036690.html

Did I mention I worked a contract at Parallels in 2010 where I did
things like add container support to smbfs and expand the lxc web
documentation and help run the OpenVZ table at Scale explaining to
people why containers scale better than virtual machines do? (If you've
ever heard somebody describe a container as "chroot on steroids", it's
probably somebody who heard my Scale pitch. :)

Alas, said contract didn't last that long because they gave me a list of
a half-dozen things I could work on and I said "anything but NFS" and
they assigned me to work exclusively on NFS, which was very much not
fun. (Since I was working for a russian company, I did my work blogging
on livejournal, ala http://landley.livejournal.com/55534.html and
http://landley.livejournal.com/55727.html and
http://landley.livejournal.com/56285.html and... oh there's at least 3
dozen posts like that from that time period.)

On the bright side I learned a lot, from http://landley.net/lxc to the
recent
http://lists.landley.net/pipermail/toybox-landley.net/2016-March/004790.html
linking to one of my old livejournal entries from the period.
Post by Andy Chu
Last year, I was using this setuid root executable
(https://git.gnome.org/browse/linux-user-chroot/commit/), which is a
nice primitive for reproducible builds (i.e. not running lots of stuff
as root just because you need to chroot).
I think I've talked here before about wanting to add a "contain" command
to Linux that would do the whole chroot-on-steroids container setup
thing. I want to do a reasonable lxc replacement _not_ requiring funky
config files just because it was designed by IBM mainframe guys.

Although under the covers instead of doing chroot it would use
pivot_root because http://landley.net/notes-2011.html#02-06-2011
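
(For scale, the non-pivot_root version of that is already a one-liner
with today's util-linux, run as root, with an illustrative rootfs path
that's assumed to contain a shell and a /proc mount point:

unshare --mount --uts --ipc --net --pid --fork \
    chroot /path/to/rootfs /bin/sh -c 'mount -t proc proc /proc && exec sh'

The "contain" version would do the pivot_root dance and pick saner
defaults instead of a pile of flags.)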

However, that's on my "after the 1.0 release" todo list, along with
adding support for "screen" and so on.

(There are a lot of interesting commands I'm pretty sure I could
implement cleanly in less than 1000 lines of code which are reasonable
to add to toybox. But the 1.0 goal is to make Android self-hosting so
that when the PC goes away we have some control over the process.)

That said, when I asked people about this at the Linux Plumber's
conference containers BOF last year, they basically said that Rocket is
based on systemd's container support, and everything else (including
Docker) is based on Rocket.

(The systemd maintainer has a very interesting talk about systemd's
containers at LinuxCon Japan in 2015, but I only caught the first five
minutes of it because it was scheduled opposite another talk I wanted to
see and I STUPIDLY assumed that just because ELC records every panel and
posts it to youtube, LinuxCon might record SOME of them. But no, ELC
used to be http://landley.net/notes-2010.html#27-10-2010 and LinuxCon
_started_ under the Linux Foundation rather than being acquired by it,
so of course LinuxCon won't let you see its panels without paying the
Linux Foundation a lot of money (or volunteering to give them content
people can only see if they pay the Linux Foundation a lot of money).)

But anyway, toybox implementing something basically compatible with what
the systemd container commands do looks reasonably straightforward. It's
just probably post-1.0 on the todo list.

(I sat down with Lennart Poettering for an hour at that convention,
telling him outright I wanted toybox to clone the 20% of systemd that
gave people the option of not using it, and he did indeed walk me
through it. I hope that someday I get time to go through my notes and
that they make sense when I do. My main objection to systemd was always
https://news.ycombinator.com/item?id=7728692 and Lennart actually
pointed out there's a python implementation of it in the systemd git for
exactly that reason, but nobody seems to know about or use it.)

Often when you sit down with people you disagree with, what they're
doing makes sense in _their_ heads...
Post by Andy Chu
And I see in their README they are pointing to a Bazel (google build
system) tool that has an option to fake root with user namespaces.
Although I'm not sure you want to make that executable setuid root.
Right now, Android isn't likely to install toybox suid. (I have a todo
item to add "make suid" and "make nosuid" targets that are defconfig
filtered for the suid binaries and the non-suid binaries, so you can
install toybox as two binaries, only one of which needs the suid bit.)

Android predates containers, and instead hijacked user ids to mean
something else. Mixing container support in with that (and/or migrating
any of what they're currently doing _to_ containers) would have to
involve long talks with a bunch of android guys, probably in person, to
work out what the result should look like.

My impression from the one time I met Elliott in person was that he's
drinking from a firehose of contributions from large corporations and
that they're in a serious red queen's race which doesn't really give
them any time for long-term design planning. The major design
initiatives I've seen since are things like "try to fix the build
system". (The GUI gets love because it's what people see.)

Add to that a BILLION seats to maintain backwards compatibility with,
and you can see why I've bumped dealing with containers to after a 1.0
release that's good enough to build AOSP under Android.
Post by Andy Chu
Post by Rob Landley
A) Elaborate on "oddly conflates" please? I saw it as 'ensure this path
is there'.
B) [ ! -d "$DIR" ] && mkdir "$DIR"
-p, --parents no error if existing, make parent directories as needed
I guess you can think of the two things as related, but it's easy to
imagine situations where you only want to create a direct descendant
and it's OK if it exists.
Yes, it's ok if it exists? That's what mkdir -p does? (Unless it exists
as a file?) I still don't understand: what behavior did you _prefer_?

mkdir -p "$(dirname "$DIR")" && mkdir -p "$DIR"

maybe?
Post by Andy Chu
B) has a race condition whereas checking errno doesn't, and mkdir $DIR
|| true has the problem that it would ignore other errors.
There's still a race condition, it being internal to mkdir doesn't
change anything. Somebody can rmdir the intermediate directories mkdir
creates before it can create the next one in sequence (in which case
mkdir -p can indeed return an error).

Heck, somebody could come along and unlink your "mkdir onedir" right
after you create it. The -p isn't required for a race condition here...

(One of the reasons I use openat() and friends so heavily is I can get
the race conditions down to a dull roar.)
Post by Andy Chu
Post by Rob Landley
Post by Andy Chu
# likewise rm can't be run twice; the second time it will fail because
the file doesn't exist. --force conflates the behavior of ignoring
missing arguments with not prompting for non-writable files
-f means "make sure this file is not there".
-f, --force ignore nonexistent files and arguments, never prompt
The first behavior makes it idempotent... the second disables the
check when writing over read-only files, which is unrelated to
idempotency (and yes I get that you're modifying the directory and not
the file, but that's the behavior rm already has)
It's not writing over them, it's removing a link to them. You can have
multiple links to the same file in a given filesystem; the storage is
only released when the last reference goes away. (Open files count as
a reference, and there's a special case hack in the kernel that lets "ln
/proc/self/fd/3 filename" work if the target's on the right filesystem
(still can't do a cross-filesystem hardlink, but those files aren't
_really_ symlinks), just so you CAN create a filesystem entry for a file
that was deleted but which you still have open.)

(Checkpoint and restore in userspace need to be able to do that.
Containers again. I've vaguely wondered if CRIU is a good thing to add
to toybox when I get around to proper container support, but I was
mostly waiting for it to stop being a moving target. It's on the todo list!)
Post by Andy Chu
Post by Rob Landley
Post by Andy Chu
# behavior depends on whether bar is an existing directory, -T /
--no-target-directory fixes this I believe
$ cp foo bar
I do a lot of "cp/mv/rsync fromdir/. todir/." just to make this sort of
behavior shut up, but it's what posix said to do.
I left out the -r, and for mv it has to be "mv fromdir/* todir/" because
there's no -r there.
mv works differently, but for cp and rsync it means never winding up
with todir/todir and two copies of the rsynced files instead of an
actual proper rsync.
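
A minimal illustration with GNU cp:

$ mkdir -p from/sub to; touch from/file
$ cp -r from to       # creates to/from/...
$ cp -r from/. to/.   # merges: to/file and to/sub, no extra to/from level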
Post by Andy Chu
$ ls
bar foo # empty dirs
A test which defeats the purpose of a syntax trying to ensure the
contents of one directory gets moved to the contents of another
directory, rather than creating a "from" directory under "to".
Post by Andy Chu
$ mv foo/. bar/.
mv: cannot move ‘foo/.’ to ‘bar/./.’: Device or resource busy
$ mv -T foo bar # now foo is moved over the empty dir bar
Post by Rob Landley
Yes and no. I've seen a lot of people try to "fix" unix and go off into
the weeds of MacOS X or GoboLinux. Any time a course of action can be
https://xkcd.com/927/
Unix has survived almost half a century now for a _reason_. A corollary
to Moore's Law I noticed years ago is that 50% of what you know is
obsolete every 18 months. The great thing about unix is it's mostly the
same 50% cycling out over and over.
Definitely agreed -- but that's why I'm not creating an alternative,
but starting with existing behavior and adding to it.
I'm starting with existing behavior and trying to get it right. Given
how bloated the GNU versions are, and that original motivation of
busybox and toybox was to create stripped-down versions, adding behavior
is something I try not to take lightly.
Post by Andy Chu
That's one of
the reasons I am interested in toybox... to puzzle through all the
details of existing practice and standards where relevant, to make
sure I'm not inventing something worse :)
Good luck with that. I'm still not there myself. :)
Post by Andy Chu
The motivation for the idempotency is a long story... but suffice to say
that people are not really using Unix itself for distributed systems.
They are building non-composable abstractions ON TOP of Unix as the
node OS (new languages and data formats -- Chef/Puppet being an
example; Hadoop/HDFS; and tons of Google internal stuff). AWS is a
distributed operating system; Google has a few distributed operating
systems as well. It's still the early days and I think they are
missing some lessons from Unix.
Mike Gancarz wrote a great book called "The Unix Philosophy".

Back when I was still on speaking terms with Eric Raymond I spent
several months crashing on the couch in his basement doing an "editing
pass" on The Art of Unix Programming that shrank the book from 9 to 20
chapters. (According to the introduction he almost made me a co-author.)

We tried to cover a lot of what we thought the unix philosophy was in
there too, except he dropped out of college to become a Vax admin the
same year my family got a commodore 64 for christmas (which was in my
room by new year's). He was an ex-math prodigy who burned out young, and
I got a math minor for the same reason an acrophobic person would go
skydiving (I refused to be beaten by it).

We came at everything from opposite poles (8 bit vs minicomputer,
mathematician vs engineer), and the book got so much bigger because I
argued with him about everything (and then made him write up the results
of our arguments so the book would have a consistent voice).
Post by Andy Chu
Sure I could just go change coreutils and bash ... I've been puzzling
through the bash source code and considering that.
I try not to look at FSF source code. Life's too short for that.
Post by Andy Chu
If one of your goals is to support Debian, I think you should be
really *happy* that they went through all the trouble of porting their
shell scripts to dash, because that means all the shell scripts use
some common subset of bash and dash.
Allow me to refer you to an earlier rant on that topic, rather than
repeating it here:

http://lists.landley.net/pipermail/toybox-landley.net/2015-January/003831.html

Not happy doesn't begin to cover it.
Post by Andy Chu
Portability means that the
scripts don't go into all the dark corners of each particular
implementation.
There's a little more to it than that. :)
Post by Andy Chu
bash is like 175K+ lines of code, and if you wanted to support all of
it,
Who said I wanted to support all of it? I don't even know what all of it
is. I'm not supporting all of the sed extensions the gnu/dammit
implementation added, largely because I haven't seen anything using
them. (I have a todo item to add + to ranges, but don't remember what
that means at the moment because the note's over a year old. I should
look it up...)

The point is I want to support the parts people actually _use_.
Post by Andy Chu
I think you would end up with at least 50K LOC in the shell...
I mentioned in http://landley.net/aboriginal/history.html that looking
at the gnu implementation of cat and seeing it was 833 lines of C was
what started my looking at busybox source in the first place. At the
time busybox's was 65 lines, half of which was license boilerplate.
That's MORE than a factor of 10 difference, and that's not at all
unusual for gnu bloatware.

The only two toybox commands that wc -l puts over 1000 lines are sed.c
(1062 lines, 148 of which are help text), and ps.c (which is currently
implementing 5 commands, ps, top, iotop, pgrep, and pkill, only because
at the time I didn't know how to factor out the common infrastructure
into /lib and now I do). Once I break top.c and pgrep.c out from ps.c
(and factor the common infrastructure out into lib/), the result should
be _well_ under 1000 lines.

Yes, there's common infrastructure in lib. Currently totalling 4380
lines for "cat lib/* | wc -l". Add in main.c at the top level (228 lines) and
that means toybox's shared infrastructure is 4608 lines, for all of it
combined.

No, I don't think implementing a proper bash replacement will take
50,000 lines. I expect to keep it below 1/10th of that, but we'll have
to see...
Post by Andy Chu
which is almost the size of everything in toybox to date.
Which should be a hint that something is off, yes.

Possibly my idea of "reasonable bash replacement" differs from yours?
Post by Andy Chu
If on the
other hand you want a reasonable and compatible shell, rather than an
"extremely compatible" shell,
Look at the cp I did. Fully _half_ the command line options aren't
listed in posix. The result is 492 lines implementing cp, mv, and
install in one file. (Admittedly heavily leveraging 193 lines of dirtree.c.)

I have no idea how big the gnu cp implementation is (didn't look at it,
not gonna), but I'm guessing bigger than that.
Post by Andy Chu
it would probably be a lot less code...
hopefully less than 20K LOC (busybox ash is 13K LOC IIRC, but it's
probably too bare)
Busybox ash is craptacular. When I was working on busybox there was
lash, hush, msh, and ash, and my response was to throw 'em all out and
start over with bbsh (which became toysh; the ONLY one of those four I
considered worth even trying to base bbsh on was lash, and that acronym
expanded to lame-ass shell.)

Admittedly they've patched it fairly extensively since then (a process I
haven't really been following), but there was never anything approaching
a clean design.
Post by Andy Chu
Post by Rob Landley
I've come to despise declarative languages. In college I took a language
course where I locked the prolog interpreter into a CPU-eating loop for an
hour, in about 5 lines. The professor looked at it for a bit, and then
basically said that to write a prolog program that DIDN'T do that, I had to
understand how the prolog interpreter was implemented. And this has pretty much
been my experience with declarative languages ever since, ESPECIALLY make.
This is a long conversation, but I think you need an "escape hatch"
for declarative ones, and make has one -- the shell.
Mixing imperative and declarative contexts does not IMPROVE matters. It
means you need to control the order in which things happen, and can't.

You know how people work around that in make? By using it recursively,
DEFEATING THE PURPOSE OF MAKE. (http://aegis.sourceforge.net/auug97.pdf
was a lovely paper on that.)
Post by Andy Chu
If you don't
have an escape hatch, you end up with tortured programs that work
around the straightjacket of the declarative language.
The escape hatch IS working around the straightjacket. Mixing them
doesn't make things any less torturous.

You'll notice that all my make wrapper is doing is calling a script.
Either make.sh, single.sh, or install.sh. And you can install all those
yourself if you want to.

I am using make's target syntax to determine whether or not a
prerequisite changed and thus we should call the build script, but I
could just as easily use find -newer (and internally the scripts do use it
to determine which files need rebuilding).
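
A minimal sketch of the find -newer approach (hypothetical paths, not what
the scripts literally do):

# rebuild only if the output is missing or some source file is newer
if [ ! -e generated/output ] || [ -n "$(find src -newer generated/output)" ]
then
  ./make.sh
fi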
Post by Andy Chu
(But this is
not really related what I was suggesting with idempotency; this is
more of a semantic overload of "declarative")
Unfortunately GNU make's solution was not to rely on the escape hatch
of the shell, but to implement a tortured shell within Make (it has
looping, conditionals, functions, variables, string library functions,
etc. -- an entirely separate Turing complete language)
Android decided to replace make entirely, AOSP now uses something else.
("Ninja" or some such, elliott mentioned it here a while ago and I need
to add a note to the roadmap.)

I think I still need a "make" just because so much else uses make. But I
have an existing "make rant" about how ./configure; make; make install
EACH need a "cvs->git" style complete throw out and rethink. The most
recent time I linked to it here seems to be:

https://www.mail-archive.com/***@lists.landley.net/msg01915.html

The rant itself being these two posts:

http://lists.landley.net/pipermail/aboriginal-landley.net/2011-June/000859.html
http://lists.landley.net/pipermail/aboriginal-landley.net/2011-June/000860.html

Honestly, if I could work all this detail into design.html and friends I
would, but there's just too much to cover. At best I could get a link
collection...
Post by Andy Chu
Make's abstraction of lazy computation is useful (although it needs to
be updated to support directory trees and stuff like that). But most
people are breaking the model and using it for "actions" -- as
mentioned, the arguments to make should be *data* on the file system,
and not actions; otherwise you're using it for the wrong job and
semantics are confused (e.g. .PHONY pretty much tells you it's a hack)
One of the first programs I wrote for OS/2 circa 1992 was "bake", I.E.
"better make". The first line of its data file was the output file to
make (with any compiler options appended), the remaining lines were the
source files, and then IT parsed them to figure out how they fit
together and when what needed rebuilding.

Alas, the code is lost to history. There's the occasional trace of it
online
(https://fidonet.ozzmosis.com/echomail.php/os2prog/1b4f877b1d235864.html
for example) but I went off to other things...
Post by Andy Chu
Post by Rob Landley
I do this kind of thing ALL THE TIME. I have a black belt in "sticking
printfs into things" because I BREAK DEBUGGERS. (I'm quite fond of
strace, though, largely because it's survived everything I've thrown at
it and is basically sticking a printf into the syscall entry for me so I
don't have to run the code under User Mode Linux anymore, where yes I
literally did that.)
I think the problem is that you expect things to actually work! :)
I expect to have to MAKE things work. With a spoon.
Post by Andy Chu
A lot of programmers have high expectations of software; users generally
have low expectations.
http://blog.regehr.org/archives/861 -- "How have software bugs trained
us? The core lesson that most of us have learned is to stay in the
well-tested regime and stay out of corner cases. Specifically, we will
... "
Scour down to the bare metal with fire and build back up from there.
Post by Andy Chu
http://zedshaw.com/2015/07/08/i-can-kill-any-computer/
I was definitely like this until I learned to stop changing defaults.
Nobody tests anything but the default configuration.
Yes, pretty much why toybox "defconfig" is what I expect people to use,
and I'm eliminating command sub-options.

However, _I_ test my code. I turn my ability to break everything on the
stuff I write, and go down weird little ratholes nobody else is ever
going to notice because they bother me. All the time.
Post by Andy Chu
Want to switch
window managers in Ubuntu? Nope, I got subtle drawing bugs related to
my video card. As penance for my lowered expectations, I try to work
on quality software...
I just upgraded my netbook from xubuntu 12.04 to xubuntu 14.04. This ate
3 days and I'm still not done (of _course_ the upgrade didn't work and
turned into a complete reinstall. Of course it was still broken after a
reinstall. Of course I tweeted an xkcd strip at somebody
(https://twitter.com/landley/status/714273505592750081) to explain the
current status of it a few hours ago).

But on the bright side when I'm done I may be able to move my email to
the machine I actually carry with me most of the time, which would be
nice. (Of COURSE the version of thunderbird in 12.04 can't read the data
files from the version in 14.04.)
Post by Andy Chu
Post by Rob Landley
Oh, and when $(commands) produce NUL bytes in the output, different
shells do different things with them. (Bash edits them out but retains
the data afterwards.)
Yeah hence my warning about trying to be too compatible with bash ...
Oh, I know what I'm in for. It's just one of those "I'm not locked in
with bash, bash is locked in with me" sorta things.

http://www.schlockmercenary.com/2006-07-13

However, I don't have space to open that can of worms _yet_.
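
(The NUL behavior mentioned above is easy to check in bash:

$ x=$(printf 'a\0b'); echo "${#x}" "$x"
2 ab

The NUL is dropped but the 'b' after it survives; newer bash versions also
print a warning about the ignored null byte.)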
Post by Andy Chu
Reading the aosabook bash article and referring to the source code
opened my eyes a lot.
Which article? (URL?)
Post by Andy Chu
sh does have a POSIX grammar, but it's not
entirely useful,
Yeah, I noticed. I _have_ read the posix shell spec all the way through
(although it was the previous release, susv3). I also printed out the
bash man page and read through _that_ (although it was bash 2.x)... You
may notice a theme here. I did extensive research on this... circa 2006.
Post by Andy Chu
as he points out, and I see what he means when he
says that using yacc was a mistake (top-down parsing fits the shell
more than bottom-up).
I don't intend to use yacc.
Post by Andy Chu
On the other hand, writing a shell parser and lexer by hand is a
nightmare too (at least if you care about bugs, which most people seem
not to).
Meh, I'm a huge fan of Fabrice Bellard's original tinycc which took
exactly that approach to C. (Circa 2006. I maintained my own fork for a
while, http://landley.net/code/tinycc explains why I stopped. The
current stuff is just _nuts_.)
Post by Andy Chu
I'm experimenting with using 're2c' for my shell lexer,
which seems promising.
I'm going with hand-written parser. It's not that hard to do it right.

Also keep in mind I have incentive from $DAYJOB to make it work nicely
on nommu systems. :)
Post by Andy Chu
Reading the commit logs of bash is interesting... all of its features
seem to be highly coupled. There are lots of lists like this:
http://git.savannah.gnu.org/cgit/bash.git/tree/RBASH . The test
matrix would be insane.
Yup.

I intend to do more or less what posix wants first (with my standard
"document where I deviate from posix being stupid and move on"), and
then add lots of things like curly bracket filenames and <(subshell)
arguments and so on.
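
For anyone who hasn't run into those two, a quick bash illustration
(hypothetical file names):

$ echo file{1,2,3}.txt               # curly bracket filenames (brace expansion)
file1.txt file2.txt file3.txt
$ diff <(sort a.txt) <(sort b.txt)   # <(subshell) arguments (process substitution)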

If somebody wants -r, they can poke me on the list. Containers exist now.
Post by Andy Chu
Post by Rob Landley
Post by Andy Chu
and
toybox/busybox are obvious complements to a shell. Though it's
interesting that busybox has two shells and toybox has zero, I think
my design space is a little different in that I want it to be sh/bash
compatible but also have significant new functionality.)
Other than "loop", what are you missing?
1) People keep saying to avoid shell scripts for serious "software
engineering" and distributed systems.
People keep saying to avoid C for the same reason. Those people are wrong.
Post by Andy Chu
I know a lot of the corner
cases and a lot of people don't, so that could be a defensible
position. You can imagine a shell and set of tools that were a lot
more robust (e.g. pedantically correct quoting is hard and looks ugly,
but also more than that)
2) Related: being able to teach shell to novices with a straight face.
Shell really could be an ideal first computing language, and it was
for many years. Python or even JavaScript is more favored now
(probably rightly).
Python 3.0 eliminated my interest in python 2.0 _and_ 3.0.

Javascript has the advantage of being in every web browser. It's a
giant mess of a language other than that, and not really designed for
use outside its initial framework (node.js notwithstanding).
Post by Andy Chu
But honestly shell has an advantage in that to
*DO* anything, you need to talk to a specific operating system, and
Python and JavaScript have this barrier of portability. But the bar
has been raised in terms of usability -- e.g. memorizing all these
single letter flag names is not really something people are up to.
Education is a can of worms I'm not going into right now. (I say that
having taught night courses at the local community college for a couple
years way back when. This email is long enough as it is...)
Post by Andy Chu
3) Security features for distributed systems ... sh is obviously not
designed for untrusted input (including what's on the file system).
I could get into a lot of details but I guess my first task is to come
up with something "reasonably" compatible with sh/bash, but with a
code structure that's extensible.
Shells are interesting because there's a giant pile of existing shell
scripts that they can run, and a lot of people with knowledge of how to
write shell scripts in the current syntax. Your new syntax benefits from
neither of those.
Post by Andy Chu
FWIW toybox code is definitely way cleaner than bash, though I
wouldn't necessarily call it extensible. You seem to figure out the
exact set of semantics you want, and then find some *global* minimum
in terms of implementation complexity,
Yeah, sort of the point.
Post by Andy Chu
which may make it harder to add
big features in the future (you would have to explode and compress
everything again).
I've done it several times already over the course of the project.

I'm sure I have an existing rant on this, the phrase I normally use is
"infrastructure in search of a user" so lemme google for that..

And of course the first hit is
http://lists.landley.net/pipermail/toybox-landley.net/2013-April/000882.html
which is the first message in the ifconfig cleanup series from
http://landley.net/toybox/cleanup.html
Post by Andy Chu
I suppose that is why this silly -n patch requires
recalling everything else about cp/mv, like --remove-destination :)
Understanding all the possible combinations so you can test them and get
the interactions right also requires that.
Post by Andy Chu
But I definitely learned something from this style even though I'm not
sure I would use it for most projects!
Oh I don't use it for all projects either.

I should have called toybox "dorodango" (ala
http://www.dorodango.com/about.html) because it's ALL about incessant
polishing.

Usable versions of these command line utilities already exist. Busybox
already existed when I started Toybox. The gnu tools existed when Erik
Andersen started modern busybox development. The BSD tools existed when
gnu started. The system V tools existed when bsd started.

Along the way there's been a dozen independent implementations that
forked off that aren't in that list. The first full from-scratch unix
clone was Coherent from the mark williams company in 1980, which
included its own kernel, compiler, libc, and command line utilities.
Linux forked off of Minix, another clean room clone of the entire OS
including kernel, compiler, libc, and command line utilities.

I'm trying to do a _BETTER_ job.
Post by Andy Chu
Andy
Rob
Andy Chu
2016-04-15 10:51:47 UTC
Permalink
Post by Rob Landley
If MirBSD had a public domain equivalent license I'd just suck the whole
thing in _now_ as a starting point for toysh, but alas it's using one of
the many, many conflicting "copy this license text verbatim into derived
I dunno, mksh seems good in some ways (I've been reading the code as
mentioned). But it's also 31K lines of code... That doesn't seem like
your style :)
Post by Rob Landley
Post by Andy Chu
bash seems to talk with some regret over support for multibyte
characters: http://aosabook.org/en/bash.html
Eh, do it right from the start and it's not that hard.
I'm not a unicode expert, but when I say multibyte characters I mean
16 or 32-bit wide characters in memory. I think that is the
complication with mksh $(( )) and with bash. utf-8 is mostly trivial
to support if you use the SAME representation in memory. char* IS
utf-8!!!

I didn't quite understand this until some rants from the Go/Plan 9
guys at work, actually talking about Python's unicode strategy. The
whole point of utf-8 is that you don't have to modify your existing C
code to use it -- unless you require O(1) code point access, i.e.
indexing, which surprisingly little real code does.

Go uses a rune library for unicode, which is quite different than
Python's unicode vs. str or C/C++ wchar_t and all that nonsense.

So my hope is that I don't have to do anything special in my shell for
Unicode -- I will just support utf-8, which doesn't really require any
changes. So far I don't see why that isn't viable.
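
The bytes-vs-characters distinction is visible even from the shell, in a
UTF-8 locale:

$ s=café
$ echo ${#s}               # bash counts characters here
4
$ printf %s "$s" | wc -c   # wc -c counts bytes; the é is two bytes
5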
Post by Rob Landley
But anyway, toybox implementing something basically compatible with what
the systemd container commands do looks reasonably straightforward. It's
just probably post-1.0 on the todo list.
(I sat down with Lennart Poettering for an hour at that convention,
telling him outright I wanted toybox to clone the 20% of systemd that
gave people the option of not using it, and he did indeed walk me
through it. I hope that someday I get time to go through my notes and
that they make sense when I do. My main objection to systemd was always
https://news.ycombinator.com/item?id=7728692 and Lennart actually
pointed out there's a python implementation of it in the systemd git for
exactly that reason, but nobody seems to know about or use it.)
That link seems to have broken, found it here:
http://www.landley.net/notes-2014.html#23-04-2014

Anyway the short summary is I don't like systemd's lack of modularity
either. And in addition I don't agree with their goal of getting rid
of shell scripts in the bootstrap, although I understand the
motivation.

That is one of the motivations for my shell... to make it so shell
scripts in the boot process *aren't* a problem. There are lots of
excuses (and valid reasons) not to write shell now, and I think those
issues are fixable.
Post by Rob Landley
Add to that a BILLION seats to maintain backwards compatibility with,
and you can see why I've bumped dealing with containers to after a 1.0
release that's good enough to build AOSP under Android.
Yes, I definitely agree with that goal... IMO containers need to
settle down before toybox has something to implement. I think rocket
is still nascent, and if they can extract the good parts of systemd's
container support that will be good for toybox.
Post by Rob Landley
Post by Andy Chu
The motivation for the idempotency is a long story... but suffice to say
that people are not really using Unix itself for distributed systems.
They are building non-composable abstractions ON TOP of Unix as the
node OS (new languages and data formats -- Chef/Puppet being an
example; Hadoop/HDFS; and tons of Google internal stuff). AWS is a
distributed operating system; Google has a few distributed operating
systems as well. It's still the early days and I think they are
missing some lessons from Unix.
Mike Gancarz wrote a great book called "The Unix Philosophy".
Back when I was still on speaking terms with Eric Raymond I spent
several months crashing on the couch in his basement doing an "editing
pass" on The Art of Unix Programming that shrank the book from 9 to 20
chapters. (According to the introduction he almost made me a co-author.)
Oh yes I see that! I've recommended this book to lots of people and
I've returned to it many times.

As I said, all these nascent Cloud OSes are ostensibly using Unix, but
are missing some lessons of architecture from Unix and Internet
protocols.

Though an even better example is Android. Android is Linux but NOT
Unix. Android is an operating system meant to run only Java programs.
You don't fork() and read the shebang line in Android; you spawn a new
thread in the Java VM. You don't use pipes or sockets or file
descriptors; you use intents and activities, which are Java-specific
messaging APIs.

Now I won't go as far as to say that's wrong, since Unix GUIs have not
been a great success... but I do think this architecture has caused
some growing pains in the ecosystem (i.e. due to lack of language
heterogeneity).
Post by Rob Landley
Who said I wanted to support all of it? I don't even know what all of it
is. I'm not supporting all of the sed extensions the gnu/dammit
implementation added, largely because I haven't seen anything using
them. (I have a todo item to add + to ranges, but don't remember what
that means at the moment because the note's over a year old. I should
look it up...)
The point is I want to support the parts people actually _use_.
Well you have to define what the use cases are ... it seems like your
goals have encompassed "everything", and are also growing. This might
require reimplementing all of bash and coreutils, etc. But if the
goal is just Aboriginal -- e.g. implement enough of a shell to rebuild
bash, make, etc. and enough of make to rebuild bash, make, etc. Then
that task seems a lot more feasible and well-defined.

Bash is so widely deployed that to a first approximation every feature
is used *somewhere*...
Post by Rob Landley
No, I don't think implementing a proper bash replacement will take
50,000 lines. I expect to keep it below 1/10th of that, but we'll have
to see...
Possibly my idea of "reasonable bash replacement" differs from yours?
Yeah we'll have to see... I'm working on it :) I think it's probably
possible to write a shell that can rebuild basic GNU packages (which
tend to use autotools) in 7-10K lines of code (but no less than that).
But interactivity adds a lot of requirements. The biggest file in
mksh is "edit.c", which is 5500 lines by itself!

FWIW I have read most of the POSIX spec. And also I realized I have a
printed-out copy of it in the Apress book "Portable Shell Scripting"
by Seebach, which I bought in 2010 or so ... I have been wanting to
write a shell for at least that long! And as mentioned, I've been
testing bash and dash against it, and it's pretty accurate and
valuable. I'm guessing you will occasionally see bashisms like [[
in build scripts. I actually didn't realize that $(()) was in POSIX.
Post by Rob Landley
Busybox ash is craptacular. When I was working on busybox there was
lash, hush, msh, and ash, and my response was to throw 'em all out and
start over with bbsh (which became toysh; the ONLY one of those four I
considered worth even trying to base bbsh on was lash, and that acronym
expanded to lame-ass shell.)
As mentioned, there does seem to be a great propensity to use gotos,
macros, globals, and long functions (approaching 1000 lines) in many
shell implementations...
Post by Rob Landley
Android decided to replace make entirely, AOSP now uses something else.
("Ninja" or some such, elliott mentioned it here a while ago and I need
to add a note to the roadmap.)
Ninja is very nice -- its lexer uses re2c, which inspired me to use
it. Search for re2c here:

https://github.com/ninja-build/ninja/blob/master/src/lexer.in.cc

I actually started with this for my shell lexer, and a tiny bit of the
Ninja code is left.

I built AOSP and Cyanogen from scratch like 18 months ago with GNU
make. My understanding is that it still uses the same Android.mk
makefile fragments, but they are compiled to Ninja files by Kati:

https://github.com/google/kati

Kati has a single-threaded executor, but also a compiler to Ninja text
format, which does fast parallel execution and fast incremental
rebuilds.

Your resistance to C++ is understandable since in some ways it's
anti-Unix. I felt the same way for a long time -- I actually sold my
C++ books when starting at Google, hoping not to write in it...

But the Google style helps a lot (some C++ purists deride it as "C
with classes"). Ninja, Kati, and my shell are all written
Google-style C++ (no exceptions, StringPiece, unit test framework,
etc.) Also, C++ has evolved and gotten better. People didn't really
*know* how to write C++ in the 90's ... it took the industry awhile to
collectively learn it!

It's basically like Git. It doesn't make any global sense, and it has
a crapton of features, but it does have features you can ONLY get
there. And that makes it legitimately the best tool for many jobs.

At least there is goodness there... there are plenty of things that
are bloated, and widely used, but at the same time are NOT the best
tool for any job :)
Post by Rob Landley
Post by Andy Chu
I'm experimenting with using 're2c' for my shell lexer,
which seems promising.
I'm going with hand-written parser. It's not that hard to do it right.
As mentioned, my parser is hand-written, but I'm basically porting it
from a machine-checked ANTLR grammar, which is based on the POSIX
grammar.

The lexer is partially generated with re2c, which is absolutely the
right choice IMO... I'll publish the code within a few weeks and I
think you'll see what I mean. It's completely different than lex/flex
-- the Ninja code gives a good idea of how it works. I think you're
being too dismissive without seeing the code :) There is something to
learn here.
Post by Rob Landley
Post by Andy Chu
1) People keep saying to avoid shell scripts for serious "software
engineering" and distributed systems.
People keep saying to avoid C for the same reason. Those people are wrong.
Yeah but that's not really a useful attitude... as mentioned, systemd
has the goal of eliminating shell scripts from the boot process.
Engineers at Google mostly don't use shell scripts, and sometimes end
up with 1000 lines of C++ instead of 20 lines of shell... Most cloud
stuff tries to avoid shell, but still ends up with something even
worse: shell embedded in config files like YAML and JSON and whatnot.

There ARE valid reasons not to use shell; I think the language needs
to be updated a bit.
Post by Rob Landley
Shells are interesting because there's a giant pile of existing shell
scripts that they can run, and a lot of people with knowledge of how to
write shell scripts in the current syntax. Your new syntax benefits from
neither of those.
As mentioned I'm starting with POSIX; I haven't started implementing
the new language. I think it will actually be easy to write an
auto-converter, although I don't want to get too far ahead of myself.
I will publish the code soon.
Post by Rob Landley
I should have called toybox "dorodango" (ala
http://www.dorodango.com/about.html) because it's ALL about incessant
polishing.
Yes I found it quite easy to hack on the code and understand it (well
except for the sed variable names :) ). I can tell you put a lot of
care into many parts of the code, and that is definitely refreshing
and is paying dividends.

But I also think you're being a little too precious about some code
that is in bad shape... a lot of the contributed C code and shell
scripts are frankly bad, and buggy (not just stuff in pending/
either). As mentioned, I have no doubt that I could find dozens of bugs
in there if I had continued on my path with the test harness.

I was planning to add a whole bunch of tests, clean things up, and do
stuff like make tests a subprocess and define the test environment as
we discussed. But I don't mind that I got derailed at all, because I
got to work on the shell that I've been wanting to do for years :)

But yeah I am worried that you have signed yourself up for decades
worth of work, without a good parallelization strategy :) I think
tests will help other developers contribute, and also make code
reviews faster. When I do code reviews, I look at the tests first.
If I don't understand something, I ask the person to write a test, and
then I read that. That's often way faster than a lot of back and
forth through e-mail. Show me the running proof!

After there are tests, it's super easy to refactor the code for style,
and to do aggressive performance or size optimizations. I don't feel
comfortable rewriting other people's code without good tests. It's
very easy to introduce bugs.

It's too early to say but I think you will probably find some of my
shell work useful. I would think of it this way: even if someone else
implements shell and make for you, you STILL have many years of work
left on toybox :) I would say that a compatible shell and make are
not less than a year's work *each*.

Andy
enh
2016-04-15 17:16:00 UTC
Permalink
Post by Andy Chu
Post by Rob Landley
If MirBSD had a public domain equivalent license I'd just suck the whole
thing in _now_ as a starting point for toysh, but alas it's using one of
the many, many conflicting "copy this license text verbatim into derived
I dunno, mksh seems good in some ways (I've been reading the code as
mentioned). But it's also 31K lines of code... That doesn't seem like
your style :)
things to avoid/reasons why i'd happily get rid of mksh in Android:

* having lots of builtins that don't behave like the real things.
* not letting me #define to disable those builtins (i think printf
might be an exception, but i'd like to disable everything that doesn't
have to be built-in).
* requiring perl to run the integration tests.
* zero unit tests.
* breaking clang builds to save 200 bytes with manual string sharing
(this is why i'm a release behind right now).
* having a mailing list that rejects mail from gmail (this is why
upstream doesn't know).
* not using git.
* not distinguishing interactive login shells (as far as i can tell)
to allow optimizing the system rc file.

probably other stuff, but those are the ones that spring to mind. the
first and last are the ones that users notice. the rest are things
that affect maintenance.
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Andy Chu
2016-04-15 19:40:00 UTC
Permalink
OK interesting -- can you say more about the use cases of shell and
toybox in Android?

For example, why does Android need toybox expr? Does some shell
script use it? I only know a little about Android (from building AOSP
18 months ago), but my impression was that the use of shell scripts
was limited, and they were all "in-tree" so you can patch them.

So if you had some upstream packages that used shell for an init
script or something, you could patch it to use $(()) in mksh instead
of expr, and avoid expr. Or maybe it's just not worth the effort to
do that, and it's easier to have expr on there. (I thought I was the
only one who used expr :) )

Who uses the shell? I would have guessed it's for the boot process
mainly? I don't think people are installing compilers on Android
phones and building GNU packages, but I could be mistaken. And I
would think the main use cases right now for toybox are on the device,
not for platform and NDK builds, no? But maybe it is useful to build
with toybox to make the build hermetic. I know all the compilers are
vendored in.

I imagine the big users now are not end users or app developers, but
platform developers (e.g. Samsung engineers) who need to tweak things
when bringing up the OS on new hardware. I'm able to run toybox on my
Nexus 5x via ConnectBot, though honestly I'm not sure what to do with
it since it's limited to an individual app UID :)

I think it would be cool to have my shell on Android, and it's written
in Google-style C++ like Ninja like I said, with unit tests, no
globals and (tasteful) dependency injection and everything. It's only
2 weeks of work and 2800 lines of code now, so that's a long ways off
:)

But I'm curious about the use cases. I think I can probably get some
other shell users at Google to contribute, because I've done a lot of
the hard parts in a clean way. Lexing and parsing shell are
particularly hard IMO; compare with mksh's yylex() and associated
macros in lex.c, which is about 1000 lines in a single function, and
it only does part of the job my lexer does.

thanks,
Andy
Rob Landley
2016-04-15 20:35:14 UTC
Permalink
Post by Andy Chu
OK interesting -- can you say more about the use cases of shell and
toybox in Android?
For example, why does Android need toybox expr?
A) It's in posix,

B) It shares infrastructure with the shell $((math)) functions,

C) It's used rather a lot in autoconf and shell scripts. Let's see, the
first package that always broke in my linux from scratch builds when I
was making busybox work with that was binutils...

$ grep -wr expr binutils | wc -l
27429

$ grep -w expr binutils/configure | head
if expr a : '\(a\)' >/dev/null 2>&1; then
as_expr=expr
as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null`
as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null`
if expr a : '\(a\)' >/dev/null 2>&1; then
as_expr=expr
ac_optarg=`expr "x$ac_option" : 'x[^=]*=\(.*\)'`
ac_feature=`expr "x$ac_option" : 'x-*disable-\(.*\)'`
expr "x$ac_feature" : ".*[^-_$as_cr_alnum]" >/dev/null &&
ac_feature=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'`


$ grep -w expr binutils/*.sh |head
extracted_serial=`expr $extracted_serial + 1`
match_pattern_regex=`expr "$deplibs_check_method" : "$1 \(.*\)"`
major=`expr $current - $age`
current=`expr $number_major + $number_minor`
current=`expr $number_major + $number_minor - 1`
major=.`expr $current - $age`
minor_current=`expr $current + 1`
major=`expr $current - $age + 1`
iface=`expr $revision - $loop`
loop=`expr $loop - 1`
Post by Andy Chu
Does some shell script use it?
In the current Linux kernel build? (Well, monday's git, anyway?) Yes:

$ grep -w expr Makefile scripts/decodecode
Makefile: expr $(VERSION) \* 65536 + 0$(PATCHLEVEL) \* 256 +
0$(SUBLEVEL)); \
scripts/decodecode:width=`expr index "$code" ' '`
scripts/decodecode:marker=`expr index "$code" "\<"`
scripts/decodecode: marker=`expr index "$code" "\("`
Post by Andy Chu
I only know a little about Android (from building AOSP
18 months ago), but my impression was that the use of shell scripts
was limited, and they were all "in-tree" so you can patch them.
If you want to turn Android into a build environment that has at least
the _option_ to build existing packages, you need expr, yes.
Post by Andy Chu
So if you had some upstream packages that used shell for an init
script or something, you could patch it to use $(()) in mksh instead
of expr, and avoid expr.
You could, yes. But toybox's shell needs a $(()) implementation and it
might as well share code with expr.
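
The mechanical translation in question, for reference:

i=`expr $i + 1`    # forks an external expr every time
i=$((i + 1))       # POSIX shell arithmetic, same result, no extra process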
Post by Andy Chu
Or maybe it's just not worth the effort to
do that, and it's easier to have expr on there. (I thought I was the
only one who used expr :) )
It's in posix. It's used by shell scripts and makefiles in the wild.
Somebody other than me submitted an implementation to toybox.
Post by Andy Chu
Who uses the shell? I would have guessed it's for the boot process
mainly?
Didn't he just say that a shell problem broke his build?

Doesn't adb launch a shell?
Post by Andy Chu
I don't think people are installing compilers on Android
phones and building GNU packages, but I could be mistaken. And I
would think the main use cases right now for toybox are on the device,
not for platform and NDK builds, no?
Toybox is not at 1.0. Toybox is not ready to carry the weight of a
self-hosting build environment by itself yet. That's why it's not at 1.0.
Post by Andy Chu
But maybe it is useful for build
with toybox to make the build hermetic. I know all the compilers are
vendored in.
I imagine the big users now are not end users or app developers, but
platform developers (e.g. Samsung engineers) who need to tweak things
when bringing up the OS on new hardware. I'm able to run toybox on my
Nexus 5x via ConnectBot, though honestly I'm not sure what to do with
it since it's limited to an individual app UID :)
On my nexus 5 I installed "terminal" out of the app store (still dunno
why that's not there by default since it's in the AOSP build) and got a
shell prompt with toybox available.

Aboriginal Linux never requires root to build. In theory AOSP shouldn't
either. Instead I provide ownership and permission overrides to the
initramfs, squashfs, or ext2 filesystem image generators when packaging
up the directory. In theory whatever image type android's making could
similarly do that. An awful lot of the requirements for that went away
when devtmpfs happened, because now you don't need to mknod in the new
filesystem.

Speaking of which, I have a pending todo item to make
CONFIG_DEVTMPFS_MOUNT apply to initmpfs. I should finish and submit that...
Post by Andy Chu
I think it would be cool to have my shell on Android, and it's written
in Google-style C++ like Ninja like I said, with unit tests, no
globals and (tasteful) dependency injection and everything. It's only
2 weeks of work and 2800 lines of code now, so that's a long ways off
:)
Toybox needs an integrated shell and isn't going to import c++ code, but
if android winds up using your shell instead of that one it's Elliott's
call, not mine.

Writing a shell isn't actually _hard_, what it is is _elaborate_.
There's a lot of surface area to cover, and I want to do it _right_
which means not stopping and restarting 30 times while interrupted by
something else. (Once I open this can of worms it's going to eat 6
months of my life.)

That said I may need to start on this for $DAYJOB soon, since the only
nommu shell we've got right now is busybox hush and they don't want to
ship gpl code in userspace either, especially from a project with a
history of litigation. (Can't say I blame 'em...)
Post by Andy Chu
But I'm curious about the use cases. I think I can probably get some
other shell users at Google to contribute, because I've done a lot of
the hard parts in a clean way. Lexing and parsing shell are
particularly hard IMO; compare with mksh's yylex() and associated
macros in lex.c, which is about 1000 lines in a single function, and
it only does part of the job my lexer does.
*shrug* Toybox will eventually have a shell anyway, but if you want a
different one written in a different language, have fun?
Post by Andy Chu
thanks,
Andy
Rob
Andy Chu
2016-04-15 22:38:18 UTC
Permalink
Post by Rob Landley
A) It's in posix,
B) It shares infrastructure with the shell $((math)) functions,
C) It's used rather a lot in autoconf and shell scripts. Let's see,the
first package that always broke in my linux from scratch builds when I
was making busybox work with that was binutils...
Those all seemed like reasons that *toybox* has expr, not Android. I
was wondering why expr was pulled out of pending, rather than just
having a defconfig build on Android.

But Elliott basically answered my question below -- it seems they're
aiming for a POSIX compliant environment on Android. (That certainly
wasn't the case when Android launched; it was Linux but not Unix).

Anyway, I'm definitely in favor of Android being more Unix-like, to
the point of spending my time writing code for it (and it's not paid).
I WANT expr to be used of course!
Post by Rob Landley
Post by Andy Chu
I only know a little about Android (from building AOSP
18 months ago), but my impression was that the use of shell scripts
was limited, and they were all "in-tree" so you can patch them.
If you want to turn Android into a build environment that has at least
the _option_ to build existing packages, you need expr, yes.
As I said, I'm definitely for that goal, but AFAIK nobody's building
the Linux kernel on an Android device now... I'm trying to wrap my
head around things that have a 1 year time frame vs things that have a
5 or 10 year time frame.
Post by Rob Landley
Post by Andy Chu
Who uses the shell? I would have guessed it's for the boot process
mainly?
Didn't he just say that a shell problem broke his build?
I think he was saying mksh doesn't compile with Clang, which makes it
annoying to maintain. I was asking a different question.
Post by Rob Landley
Post by Andy Chu
I think it would be cool to have my shell on Android, and it's written
in Google-style C++ like Ninja like I said, with unit tests, no
globals and (tasteful) dependency injection and everything. It's only
2 weeks of work and 2800 lines of code now, so that's a long ways off
:)
Toybox needs an integrated shell and isn't going to import c++ code, but
if android winds up using your shell instead of that one it's Elliott's
call, not mine.
Writing a shell isn't actually _hard_, what it is is _elaborate_.
There's a lot of surface area to cover, and I want to do it _right_
which means not stopping and restarting 30 times while interrupted by
something else. (Once I open this can of worms it's going to eat 6
months of my life.)
I wouldn't really agree with that. You could say programming isn't
hard then either. IMO one of the most important parts of programming
is controlling complexity, and shell is inherently complex. It's also
big, and big things are hard.

In addition, parsing is not a solved problem, and the shell language
shows that (and bash shows that; I refer to the aosabook chapter
again.) It's a fair bit more elaborate than what they teach you about
parsing in compiler class.
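A canonical example, just as an illustration: you can't even tokenize a
command substitution without recursively parsing what's inside it,
because an apparently unbalanced ")" is legal when it closes a case
pattern:

$ echo $(case foo in foo) echo hi ;; esac)
hi

(at least in the shells I've tried). A regex- or line-oriented lexer has
no way to know where that substitution ends without the full parser in
the loop.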
Post by Rob Landley
That said I may need to start on this for $DAYJOB soon, since the only
nommu shell we've got right now is busybox hush and they don't want to
ship gpl code in userspace either, especially from a project with a
history of litigation. (Can't say I blame 'em...)
Post by Andy Chu
But I'm curious about the use cases. I think I can probably get some
other shell users at Google to contribute, because I've done a lot of
the hard parts in a clean way. Lexing and parsing shell are
particularly hard IMO; compare with mksh's yylex() and associated
macros in lex.c, which is about 1000 lines in a single function, and
it only does part of the job my lexer does.
*shrug* Toybox will eventually have a shell anyway, but if you want a
different one written in a different language, have fun?
I am having a lot of fun :) I have finally gotten to work on
something I've wanted to work on for years. In 2009, I actually tried
to convince Rob Pike and Russ Cox that there needs to be a shell
paired with Go, since Go is an updated C (and I'm sure they didn't
take it seriously, which was wise.)

You talk about your TODO list in a lot of messages... it seems to be a
decade or two long. So I don't really think there is any harm in
parallelizing the effort. My goals overlap with yours but aren't
identical. I suspect it may converge at some point, but if not that's
fine too. At the very least, the research I'm doing -- looking into 3
or 4 shell implementations and POSIX -- will be useful for what you
want to do. Hopefully I will find some time to make a writeup of it.

Andy
enh
2016-04-15 20:51:52 UTC
Permalink
Post by Andy Chu
OK interesting -- can you say more about the use cases of shell and
toybox in Android?
For example, why does Android need toybox expr? Does some shell
script use it? I only know a little about Android (from building AOSP
18 months ago), but my impression was that the use of shell scripts
was limited, and they were all "in-tree" so you can patch them.
Google isn't the only one shipping Android devices :-)
Post by Andy Chu
So if you had some upstream packages that used shell for an init
script or something, you could patch it to use $(()) in mksh instead
of expr, and avoid expr. Or maybe it's just not worth the effort to
do that, and it's easier to have expr on there. (I thought I was the
only one who used expr :) )
there are things that are worth fixing even if the world has to
change. but this isn't one of them. there's almost no cost to
including expr, POSIX says it should be there, and there's no obvious
advantage to us in not shipping it.
Post by Andy Chu
Who uses the shell? I would have guessed it's for the boot process
mainly? I don't think people are installing compilers on Android
phones and building GNU packages, but I could be mistaken. And I
would think the main use cases right now for toybox are on the device,
not for platform and NDK builds, no? But maybe it is useful to build
with toybox to make the build hermetic. I know all the compilers are
vendored in.
no, we don't use toybox for the host. its awkward configuration
process makes that pretty unlikely too. if/when toybox gets
gen*fs/mk*fs toys, it'll be worth looking into. but right now there's
no point.
Post by Andy Chu
I imagine the big users now are not end users or app developers, but
platform developers (e.g. Samsung engineers) who need to tweak things
when bringing up the OS on new hardware. I'm able to run toybox on my
Nexus 5x via ConnectBot, though honestly I'm not sure what to do with
it since it's limited to an individual app UID :)
so far i've only really worried about platform developers, yes. longer
term i plan on having a static toybox binary in the NDK too: for
non-platform developers, the version in your NDK will always be at
least as recent as the one on the device. so why suffer old bugs when
you can just push a known-good copy?

(see how many times "sed is broken on Android M" gets reported, for example.)
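(concretely, i'm imagining something like this -- made-up file names,
but that's the idea:

$ adb push toybox /data/local/tmp/
$ adb shell chmod 755 /data/local/tmp/toybox
$ adb shell /data/local/tmp/toybox sed -e 's/foo/bar/' /proc/version

and now "sed" is whatever you pushed, not whatever shipped on the device.)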

but, yeah, the shell is probably less interesting to app developers.
they're likely to be running "adb shell ..." from their desktop's
shell rather than using a shell on the device. (not least because "adb
shell" was basically unusable on Windows until recently.)
Post by Andy Chu
I think it would be cool to have my shell on Android, and it's written
in Google-style C++ like Ninja like I said, with unit tests, no
globals and (tasteful) dependency injection and everything. It's only
2 weeks of work and 2800 lines of code now, so that's a long ways off
:)
But I'm curious about the use cases. I think I can probably get some
other shell users at Google to contribute, because I've done a lot of
the hard parts in a clean way. Lexing and parsing shell are
particularly hard IMO; compare with mksh's yylex() and associated
macros in lex.c, which is about 1000 lines in a single function, and
it only does part of the job my lexer does.
thanks,
Andy
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Andy Chu
2016-04-15 22:48:51 UTC
Permalink
Post by enh
Post by Andy Chu
OK interesting -- can you say more about the use cases of shell and
toybox in Android?
For example, why does Android need toybox expr? Does some shell
script use it? I only know a little about Android (from building AOSP
18 months ago), but my impression was that the use of shell scripts
was limited, and they were all "in-tree" so you can patch them.
Google isn't the only one shipping Android devices :-)
OK I see... so if Samsung or whoever wants to put a new daemon on
Android, then it might use expr in its init script. And obviously
they have their own private forks which we can't see.

I know that builds, and autoconf in particular, make heavy use of
shell and expr, as Rob showed. But it seems to me that compiling
software on Android devices isn't currently a particularly big use
case (not to say that it shouldn't be).
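(The classic pattern -- purely illustrative, not quoting any particular
configure script -- is the Bourne-era counter:

  i=`expr $i + 1`    # traditional: one fork per increment

which any POSIX shell, mksh included, can do with builtin arithmetic:

  i=$((i + 1))       # no fork, no expr

so in-tree scripts could drop expr entirely if anyone cared to.)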
Post by enh
there are things that are worth fixing even if the world has to
change. but this isn't one of them. there's almost no cost to
including expr, POSIX says it should be there, and there's no obvious
advantage to us in not shipping it.
OK but I was wondering why you pulled expr out of pending, rather than
using the defconfig build. expr was WILDLY broken -- it gave the
wrong answers to MATH, and it segfaulted on trivial inputs like
matching a number-like string with a regex.
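(For example -- reconstructing the shape of it rather than quoting the
exact commands -- invocations along the lines of:

$ expr 2 + 2
4
$ expr 42 : '[0-9]*'
2

where the second asks how many characters of "42" the regex matches;
that regex-matching form was the kind of trivial input that crashed it.)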

I was worried that somebody would rely on the current broken expr
behavior, but as I understand it, every Android release is basically a
clean slate... if a current expr user relied on broken behavior in
Android M, then with Android N or whatever it would be up to them to
fix their shell script. Is that correct? It would likely be a shell
script in their private tree, so that would have to be the case, I
think.

Likewise, if you decided to change the shell, you wouldn't worry about
minor breakages? I think the shell has already changed once, right?
And it was probably mostly a superset but not strictly a superset.
Post by enh
so far i've only really worried about platform developers, yes. longer
term i plan on having a static toybox binary in the NDK too: for
non-platform developers, the version in your NDK will always be at
least as recent as the one on the device. so why suffer old bugs when
you can just push a known-good copy?
(see how many times "sed is broken on Android M" gets reported, for example.)
Can you explain this a little more? I would imagine that people use
sed on the host with the NDK, to build shared objects and so forth,
which they will then transfer over to the device with their app to
test. But is sed used at runtime on the Android device if you're
writing code with the NDK?

Or is it for having parity between a simulator on the host and the device?
Post by enh
but, yeah, the shell is probably less interesting to app developers.
they're likely to be running "adb shell ..." from their desktop's
shell rather than using a shell on the device. (not least because "adb
shell" was basically unusable on Windows until recently.)
OK, understood. Thanks for the explanation.

Andy
enh
2016-04-16 00:59:28 UTC
Permalink
Post by Andy Chu
Post by enh
Post by Andy Chu
OK interesting -- can you say more about the use cases of shell and
toybox in Android?
For example, why does Android need toybox expr? Does some shell
script use it? I only know a little about Android (from building AOSP
18 months ago), but my impression was that the use of shell scripts
was limited, and they were all "in-tree" so you can patch them.
Google isn't the only one shipping Android devices :-)
OK I see... so if Samsung or whoever wants to put a new daemon on
Android, then it might use expr in its init script. And obviously
they have their own private forks which we can't see.
I know that builds and autoconf in particular make a lot of usage of
shell and expr, as Rob shows. But it seems to me that compiling
software on Android devices isn't currently a particularly big use
case (not to say that it shouldn't be).
there are certainly people who sit not too far from me who think
that's fun, but, no, that's not really what i'm aiming at.

as my "Future command-line work" slide at the last couple of Android
partner bootcamps has said:

* No one misses busybox/coreutils --- our command-line tools “just work”.

* All the useful developer commands, with all the useful options.

i love replying to "how do i do <shell one-liner> on the device?"
questions with "<same shell one-liner>" :-)
Post by Andy Chu
Post by enh
there are things that are worth fixing even if the world has to
change. but this isn't one of them. there's almost no cost to
including expr, POSIX says it should be there, and there's no obvious
advantage to us in not shipping it.
OK but I was wondering why you pulled expr out of pending, rather than
using the defconfig build. expr was WILDLY broken -- it gave the
wrong answers to MATH, and it segfaulted on trivial inputs like
matching a number-like string with a regex.
iirc, "because it's POSIX, and we hadn't had _any_ expr before, and
something is better than nothing, and you don't get bug reports if you
don't ship".
Post by Andy Chu
I was worried that somebody would rely on the current broken expr
behavior, but as I understand it, every Android release is basically a
clean slate... if a current expr user relied on broken behavior in
Android M, then with Android N or whatever it would be up to them to
fix their shell script. Is that correct? It would likely be a shell
script in their private tree so that would have to be the case I
think.
relying on bugs is [mostly] not supported. sometimes a bug is relied
upon by everyone, in which case it's no longer really a bug. but
usually we consider ourselves to be at liberty to fix any bug.

(and although most devices aren't built directly from our tree, the
only devices i'm required to keep building are the ones in our tree,
and -- as you said earlier -- if it comes to it i can fix them all
myself.)
Post by Andy Chu
Likewise, if you decided to change the shell, you wouldn't worry about
minor breakages? I think the shell has already changed once, right?
And it was probably mostly a superset but not strictly a superset.
the previous shell was so bad everyone cheered when it went away. the
same was true for some of the old toolbox stuff. but some of the
toolbox -> toybox transition was trickier. it's still not complete! the
main reason i haven't switched ps and top is that they don't support
threads, but most of the minor issues we've had to work through have
involved output formats. ls wasn't in M (but is in N) because i had to
fix a *lot* of code that was parsing the old (non-POSIX) output. in
other cases we've had to change toybox to have traditional Android
defaults when running on Android.
Post by Andy Chu
Post by enh
so far i've only really worried about platform developers, yes. longer
term i plan on having a static toybox binary in the NDK too: for
non-platform developers, the version in your NDK will always be at
least as recent as the one on the device. so why suffer old bugs when
you can just push a known-good copy?
(see how many times "sed is broken on Android M" gets reported, for example.)
Can you explain this a little more? I would imagine that people use
sed on the host with the NDK, to build shared objects and so forth,
which they will then transfer over to the device with their app to
test. But is sed used at runtime on the Android device if you're
writing code with the NDK?
i've no idea what folks are using it for. maybe just shell one-liners,
maybe they have weird scripts. _someone_ probably is building python
on the device rather than cross-compiling it.
Post by Andy Chu
Or is it for having parity between a simulator on the host and the device?
Post by enh
but, yeah, the shell is probably less interesting to app developers.
they're likely to be running "adb shell ..." from their desktop's
shell rather than using a shell on the device. (not least because "adb
shell" was basically unusable on Windows until recently.)
OK, understood. Thanks for the explanation.
Andy
--
Elliott Hughes - http://who/enh - http://jessies.org/~enh/
Android native code/tools questions? Mail me/drop by/add me as a reviewer.
Andy Chu
2016-03-27 19:09:44 UTC
Permalink
Post by Rob Landley
Post by Andy Chu
Anecdotally, it seems like a lot of shell script issues are caused by
unexpected existing state, but in a lot of cases you don't CARE about
the existing state -- you just want a final state (e.g. a bunch of
symlinks to toybox).
"ln -sf" actually works pretty predictably. :)
One more nitpick ... ln -sf isn't idempotent for the same reason that
cp isn't. If you run it more than once you get different results.

I was trying to create a package manager with sandboxed builds, with
the aforementioned linux-user-chroot, and my own tools. Bootstrapped
all in shell. All of my ln invocations used -T /
--no-target-directory for this reason. (At least this flag doesn't
seem to be entangled with other behavior, unlike mkdir and rm).

$ mkdir foo

$ ln -sf foo link; tree --charset=ascii
.
|-- foo
`-- link -> foo

$ ln -sf foo link; tree --charset=ascii
.
|-- foo
|   `-- foo -> foo
`-- link -> foo

Compare with:

$ mkdir foo

$ ln -sf -T foo link; tree --charset=ascii
.
|-- foo
`-- link -> foo

$ ln -sf -T foo link; tree --charset=ascii
.
|-- foo
`-- link -> foo

Andy
Rob Landley
2016-03-28 06:59:50 UTC
Permalink
Post by Andy Chu
Post by Rob Landley
Post by Andy Chu
Anecdotally, it seems like a lot of shell script issues are caused by
unexpected existing state, but in a lot of cases you don't CARE about
the existing state -- you just want a final state (e.g. a bunch of
symlinks to toybox).
"ln -sf" actually works pretty predictably. :)
One more nitpick ... ln -sf isn't idempotent for the same reason that
cp isn't. If you run it more than once you get different results.
You're right, I was missing the -n
Post by Andy Chu
I was trying to create a package manager with sandboxed builds, with
the aforementioned linux-user-chroot, and my own tools. Bootstrapped
all in shell. All of my ln invocations used -T /
--no-target-directory for this reason. (At least this flag doesn't
seem to be entangled with other behavior, unlike mkdir and rm).
I haven't looked up the difference between -n and -T. I assume there is
one, but haven't dug into it yet because nobody's asked.
Post by Andy Chu
$ mkdir foo
$ ln -sf foo link; tree --charset=ascii
.
|-- foo
`-- link -> foo
I need to add --forest to toybox ps. It's on the todo list.

(In theory it should probably be "-o forest". And --forest would imply
adding -o forest -k forest.)
Post by Andy Chu
$ ln -sf foo link; tree --charset=ascii
.
|-- foo
|   `-- foo -> foo
`-- link -> foo
$ mkdir foo
$ ln -sf -T foo link; tree --charset=ascii
.
|-- foo
`-- link -> foo
$ ln -sf -T foo link; tree --charset=ascii
.
|-- foo
`-- link -> foo
Yeah, -n does that too? Toybox implements -n.

Rob