Discussion:
Bug#943327: mmdebstrap: Please support using pixz
Benjamin Drung
2019-10-23 12:20:01 UTC
Package: mmdebstrap
Version: 0.5.1-1
Severity: wishlist
Tags: upstream

Hi,

one of mmdebstrap's benefits over debootstrap is that it is faster.
Creating an xz tarball as output takes a lot of time, since xz
consumes a lot of compute power and tar uses only one core.

pixz is a parallel version of xz that can speed up the compression a lot.
tar can use it simply by specifying -Ipixz.

So please support using pixz when creating xz tarballs, maybe even as
default if pixz can be found.
--
Benjamin Drung

Debian & Ubuntu Developer
Platform Engineering Compute (Enterprise Cloud)

1&1 IONOS SE | Greifswalder Str. 207 | 10405 Berlin | Germany
E-mail: ***@cloud.ionos.com | Web: www.ionos.de

Hauptsitz Montabaur, Amtsgericht Montabaur, HRB 24498
Vorstand: Dr. Christian Böing, Hüseyin Dogan, Hans-Henning Kettler,
Matthias Steinberg, Achim Weiß
Aufsichtsratsvorsitzender: Markus Kadelke
Member of United Internet
Johannes Schauer
2019-10-23 13:10:02 UTC
Hi,

Quoting Benjamin Drung (2019-10-23 14:14:15)
one of mmdebstrap's benefits over debootstrap is that it is faster. Creating an
xz tarball as output takes a lot of time, since xz consumes a lot of
compute power and tar uses only one core.
pixz is a parallel version of xz that can speed up the compression a lot.
tar can use it simply by specifying -Ipixz.
So please support using pixz when creating xz tarballs, maybe even as default
if pixz can be found.
I had never heard of pixz. Is it still actively developed? I see that the last
release is already four years old and that Debian has had the same version since
oldstable. I just don't want to add code to support unmaintained or rarely used
software.

It also seems that, because pixz actually *changes* the data it is supposed to
compress when it receives a tarball, it introduced a new file extension,
*.tpxz.

Currently, mmdebstrap supports all compression file extensions supported by tar,
and tar currently does not know about pixz. Is there a reason why the tar
maintainers decided not to add that compressor to their list?

Thanks!

cheers, josch
Benjamin Drung
2019-10-23 14:00:01 UTC
Post by Johannes Schauer
I never heard of pixz. Is it still actively developed? I see that the last
release is already four years old and in Debian we have the same version since
oldstable. I just don't want to add code to support unmaintained or rarely used
software.
I wasn't aware that pixz hasn't seen much development recently, but
I do not see any signs that it is dead. It looks like it is
feature complete and in maintenance mode now.
Post by Johannes Schauer
Currently, mmdebstrap supports all compression file endings supported by tar
and tar currently does not know about pixz. Is there a reason why tar
maintainers thought they don't add that compressor to their list?
This question prompted me to search the web, and I found out that xz
has gained support for parallel compression (only for compressing, not
for decompressing):

$ time tar -J -cf root1.tar.xz -C root .
real 5m19,349s
user 5m19,775s
sys 0m4,185s

$ time tar -Ipixz -cf root2.tar.xz -C root .
real 0m59,250s
user 10m23,438s
sys 0m3,734s

$ time tar -I"xz -T 0" -cf root3.tar.xz -C root .
real 1m0,764s
user 10m43,783s
sys 0m3,892s

So it would be nice if mmdebstrap used -I"xz -T 0" to compress
xz in parallel. This does not require additional dependencies!
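
To put the three timings above in perspective, the wall-clock speedup can be worked out with a small shell sketch. The helper names to_seconds and speedup are made up for this illustration. The input values are copied from the runs above, where the comma is the locale's decimal separator.

```shell
#!/bin/sh
# Hypothetical helpers (not part of mmdebstrap): convert a value printed by
# `time`, such as 5m19,349s, to seconds and compare two runs.
to_seconds() {
    printf '%s\n' "$1" | tr ',' '.' |
        LC_ALL=C awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# usage: speedup BASELINE OTHER
# Prints BASELINE divided by OTHER, rounded to two decimals.
speedup() {
    LC_ALL=C awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", a / b }'
}

echo "pixz,    real: $(speedup 5m19,349s 0m59,250s)x faster than tar -J"
echo "xz -T 0, real: $(speedup 5m19,349s 1m0,764s)x faster than tar -J"
echo "xz -T 0, user: $(speedup 10m43,783s 5m19,775s)x the CPU time of tar -J"
```

On these numbers both parallel variants finish a bit more than five times faster in wall-clock time, while using roughly twice the total CPU time of the single-threaded run.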
Benjamin Drung
2019-11-12 18:30:02 UTC
tags 943327 patch
thanks

On Wed, 23 Oct 2019 15:52:26 +0200 Benjamin Drung <
Post by Benjamin Drung
So it would be nice if mmdebstrap used -I"xz -T 0" to compress
xz in parallel. This does not require additional dependencies!
Attached is a simple patch that adds the -I"xz -T 0" option when xz is used
for compression.
Johannes Schauer
2019-11-13 11:10:02 UTC
Control: tag -1 + pending

Hi,

with some minor changes committed here:

https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/4b82a664daa5b2430a00b737706ee77c75288158

Thanks!

cheers, josch
Benjamin Drung
2019-11-13 16:20:01 UTC
Post by Johannes Schauer
Control: tag -1 + pending
Hi,
https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/4b82a664daa5b2430a00b737706ee77c75288158
Sadly, this change breaks mmdebstrap:

$ LANG=C tar -tf root.tar.xz
./dev/
./dev/console
./dev/fd
./dev/full
./dev/null
./dev/ptmx
./dev/pts/
./dev/random
./dev/shm/
./dev/stderr
./dev/stdin
./dev/stdout
./dev/tty
./dev/urandom
./dev/zero
tar: Skipping to next header
tar: Exiting with failure status due to previous errors

When I developed the patch, I only checked that the tarball was created
and that the file size matched, but I didn't check the content.
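
A content check of the kind that was missing could look roughly like the following sketch. The function name verify_tarball is hypothetical. It relies only on tar exiting with a non-zero status when it cannot read the archive through to the end, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if tar can list the whole archive.
# A size check cannot notice a damaged stream, but the exit status of
# `tar -t` does.
verify_tarball() {
    LANG=C tar -tf "$1" >/dev/null 2>&1
}

# Usage sketch; the default path is only a placeholder.
if verify_tarball "${1:-root.tar.xz}"; then
    echo "archive lists cleanly"
else
    echo "archive is damaged, incomplete or missing" >&2
fi
```

Listing only proves that the headers are readable. Unpacking the archive and using the result is the stronger test.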
Johannes Schauer
2019-11-13 17:00:02 UTC
Quoting Benjamin Drung (2019-11-13 17:08:46)
Post by Benjamin Drung
Post by Johannes Schauer
https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/4b82a664daa5b2430a00b737706ee77c75288158
When I developed the patch, I just checked that the tarball was created and
the file size matches, but I didn't check the content.
Ah, indeed. This is of course because mmdebstrap assembles the tarball from two
parts and then runs the compressor outside of tar. A correct patch would
probably look more like this:

@@ -161,7 +161,7 @@ sub get_tar_compressor($) {
     } elsif ($filename =~ /\.lz4$/) {
         return 'lz4';
     } elsif ($filename =~ /\.(xz|txz)$/) {
-        return 'xz';
+        return ('xz', '--threads=0');
     } elsif ($filename =~ /\.zst$/) {
         return 'zstd';
     }
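
The structure described above, where tar produces an uncompressed stream and the compressor runs as a separate process, can be sketched in shell. This is only an illustration and not mmdebstrap's Perl code: the name pack_with_compressor is hypothetical, and a single tar call stands in for the two parts that mmdebstrap assembles. The compressor and its options are passed as separate words, mirroring the list ('xz', '--threads=0') in the patch.

```shell
#!/bin/sh
# Hypothetical illustration: tar writes an uncompressed stream to stdout and
# an external compressor, given as a command plus arguments, compresses it.
# usage: pack_with_compressor DIR OUTPUT COMPRESSOR [ARGS...]
pack_with_compressor() {
    dir=$1
    out=$2
    shift 2
    # "$@" keeps the program name and every option as separate words, so
    # `xz --threads=0` runs the program xz with one argument.
    tar -cf - -C "$dir" . | "$@" > "$out"
}

# Example call (assumes xz is installed and ./root exists):
#   pack_with_compressor root root.tar.xz xz --threads=0
```

Because the compressor only ever sees one stream on stdin, it does not matter how many pieces were written into that stream before it.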
Benjamin Drung
2019-11-13 17:40:02 UTC
I have tested this proposed change with a quick-and-dirty patch of the two
exec lines:

exec ($tar_compressor, '--threads=0') or error "[...]";

It works and creates a tarball. The generated tarball actually works
(verified by using it).
Johannes Schauer
2019-11-29 08:00:03 UTC
fixed in git:

https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/9f2ea61265c36945b1fbbc27fd70099e58df794d
Benjamin Drung
2019-12-02 14:10:02 UTC
https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/9f2ea61265c36945b1fbbc27fd70099e58df794d
That commit does not work for me:

$ ./mmdebstrap -v buster buster.tar.xz
I: automatically chosen mode: unshare
I: chroot architecture amd64 is equal to the host's architecture
Can't exec "--threads=0": No such file or directory at ./mmdebstrap line 2303.
E: cannot exec --threads=0: No such file or directory
E: failed to start --threads=0
Johannes Schauer
2019-12-02 20:30:03 UTC
Hi,

Quoting Benjamin Drung (2019-12-02 14:59:28)
Post by Benjamin Drung
Post by Johannes Schauer
https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/9f2ea61265c36945b1fbbc27fd70099e58df794d
Wow, the one time that I don't write a test case for a new feature, and of
course it keeps failing. I wonder what went wrong. I was probably accidentally
running my system's installed mmdebstrap instead of the git version when
testing my changes.

I have now added a test case to make sure that this cannot break again.

https://gitlab.mister-muffin.de/josch/mmdebstrap/commit/d262d678775ec6b270bcffaef4c5d4835ea1cd20
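
For illustration only, a round-trip check in the same spirit might look like the sketch below. It is not the test case from the commit above. The function name roundtrip_check is hypothetical, and the sketch assumes a directory of plain files rather than a full chroot with device nodes.

```shell
#!/bin/sh
# Hypothetical regression check (not the test from the commit above):
# pack DIR, compress it, decompress it again, unpack, and compare with DIR.
# usage: roundtrip_check DIR COMPRESS_CMD DECOMPRESS_CMD
roundtrip_check() {
    src=$1
    work=$(mktemp -d) || return 1
    mkdir "$work/unpacked"
    # The commands are given as strings and run through `sh -c`, so options
    # such as --threads=0 can be part of them.
    tar -cf - -C "$src" . | sh -c "$2" > "$work/archive" &&
        sh -c "$3" < "$work/archive" | tar -xf - -C "$work/unpacked" &&
        diff -r "$src" "$work/unpacked" >/dev/null
    status=$?
    rm -rf "$work"
    return "$status"
}

# Example call (assumes xz is installed and ./root exists):
#   roundtrip_check root 'xz --threads=0' 'xz -dc' && echo "round trip OK"
```

A compressor command that cannot be started makes the first pipeline fail, so a check like this would flag an error such as the "cannot exec --threads=0" failure reported earlier.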

Thanks a lot for testing it!

cheers, josch
