Discussion:
[systemd-devel] A start job is running for /dev/gpt-auto-root
Arian van Putten
2021-04-24 15:03:31 UTC
Permalink
Dear list,

I've been working on trying to integrate systemd-gpt-auto and
systemd-cryptsetup into NixOS's stage-1 init.

Everything was working great on 246; but when I updated from kernel 5.4
to 5.10 and from systemd 246 to 247, the setup stopped working.

After entering my LUKS password, the boot keeps hanging with:

A start job is running for /dev/gpt-auto-root

and eventually fails with:

[ TIME ] Timed out waiting for device /dev/gpt-auto-root.

Interestingly. Both /dev/gpt-auto-root and /dev/gpt-auto-root-luks exist
in the /dev tree so the udev rules are all fired correctly; but for some
reason they are not propagating to the device unit.
Also the btrfs kernel module didn't seem to get automatically loaded for
some reason.

For the failed boot on 247; this seems to be the interesting bit of the log:
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: Adding watch on
'/dev/dm-0'
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: sd-device: Created db
file '/run/udev/data/b254:0' for '/devices/virtual/block/dm-0'
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: Device (SEQNUM=1720,
ACTION=change) processed
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: sd-device-monitor:
Passed 947 byte to netlink monitor
Apr 24 12:18:46 localhost systemd[1]: systemd-journald.service: Received
EPOLLHUP on stored fd 46 (stored), closing.
Apr 24 12:18:46 localhost systemd[1]: Received SIGCHLD from PID 192
(systemd-cryptse).
Apr 24 12:18:46 localhost systemd[1]: Child 192 (systemd-cryptse) died
(code=exited, status=0/SUCCESS)
Apr 24 12:18:46 localhost systemd[1]: systemd-***@root.service:
Child 192 belongs to systemd-***@root.service.
Apr 24 12:18:46 localhost systemd[1]: systemd-***@root.service: Main
process exited, code=exited, status=0/SUCCESS
Apr 24 12:18:46 localhost systemd[1]: systemd-***@root.service:
Changed start -> exited
Apr 24 12:18:46 localhost systemd[1]: systemd-***@root.service: Job
59 systemd-***@root.service/start finished, result=done
Apr 24 12:18:46 localhost systemd[1]: Finished Cryptography Setup for root.
Apr 24 12:18:46 localhost audit[1]: SERVICE_START pid=1 uid=0
auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-***@root
comm="systemd"
exe="/nix/store/ixpwj1cxl4rp2qbyn0s2h4k5zj731q7c-systemd-247.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Apr 24 12:18:46 localhost systemd[1]: systemd-***@root.service:
Control group is empty.
Apr 24 12:18:46 localhost systemd[1]: ***@dev-mapper-root.target
changed dead -> active
Apr 24 12:18:46 localhost systemd[1]: ***@dev-mapper-root.target: Job
72 ***@dev-mapper-root.target/start finished, result=done
Apr 24 12:18:46 localhost systemd[1]: Reached target Block Device
Preparation for /dev/mapper/root.
Apr 24 12:18:46 localhost kernel: kauditd_printk_skb: 4 callbacks suppressed
Apr 24 12:18:46 localhost kernel: audit: type=1130
audit(1619266726.241:15): pid=1 uid=0 auid=4294967295 ses=4294967295
subj=kernel msg='unit=systemd-***@root comm="systemd"
exe="/nix/store/ixpwj1cxl4rp2qbyn0s2h4k5zj731q7c-systemd-247.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Apr 24 12:18:50 localhost systemd-udevd[148]: Cleanup idle workers
Apr 24 12:18:50 localhost systemd-udevd[207]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[209]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[205]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[206]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[207]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[209]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[205]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[206]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [205] exited
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [206] exited
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [207] exited
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [209] exited
Apr 24 12:19:48 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
dev-gpt\x2dauto\x2droot.device/start timed out.
Apr 24 12:19:48 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
56 dev-gpt\x2dauto\x2droot.device/start finished, result=timeout
Apr 24 12:19:48 localhost systemd[1]: Timed out waiting for device
/dev/gpt-auto-root.
Apr 24 12:19:48 localhost systemd[1]: initrd-root-device.target: Job 57
initrd-root-device.target/start finished, result=dependency
Apr 24 12:19:48 localhost systemd[1]: Dependency failed for Initrd Root
Device.
Apr 24 12:19:48 localhost systemd[1]: initrd-root-device.target: Job
initrd-root-device.target/start failed with result 'dependency'.

Whilst the successful boot on 246 looks like this:

r 24 12:44:32 localhost systemd-cryptsetup[176]: root: retaining kernel
read ahead of 256 (requested 256)
Apr 24 12:44:32 localhost systemd-cryptsetup[176]: Releasing crypt device
/dev/gpt-auto-root-luks context.
Apr 24 12:44:32 localhost systemd-cryptsetup[176]: Releasing device-mapper
backend.
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: Adding watch on
'/dev/dm-0'
Apr 24 12:44:32 localhost systemd-cryptsetup[176]: Closing read only fd for
/dev/gpt-auto-root-luks.
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: sd-device: Created db
file '/run/udev/data/b254:0' for '/devices/virtual/block/dm-0'
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: Device (SEQNUM=2066,
ACTION=change) processed
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: sd-device-monitor:
Passed 891 byte to netlink monitor
Apr 24 12:44:32 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device:
Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
52 dev-gpt\x2dauto\x2droot.device/start finished, result=done
Apr 24 12:44:32 localhost systemd[1]: Found device /dev/gpt-auto-root.
Apr 24 12:44:32 localhost systemd[1]:
dev-disk-by\x2did-dm\x2dname\x2droot.device: Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]:
dev-disk-by\x2did-dm\x2duuid\x2dCRYPT\x2dLUKS1\x2d0c642ebc2b7643dcb9ba34f1125d7f16\x2droot.device:
Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: dev-mapper-root.device: Changed dead
-> plugged
Apr 24 12:44:32 localhost systemd[1]:
dev-disk-by\x2duuid-5cb38441\x2d7012\x2d4aaf\x2da70c\x2db11aaaa212d3.device:
Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: dev-dm\x2d0.device: Changed dead ->
plugged
Apr 24 12:44:32 localhost systemd[1]:
sys-devices-virtual-block-dm\x2d0.device: Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: Received SIGCHLD from PID 176
(systemd-cryptse).
Apr 24 12:44:32 localhost systemd[1]: Child 176 (systemd-cryptse) died
(code=exited, status=0/SUCCESS)
Apr 24 12:44:32 localhost systemd[1]: systemd-***@root.service:
Failed to read oom_kill field of memory.events cgroup attribute: No such
file or directory
Apr 24 12:44:32 localhost systemd[1]: systemd-***@root.service:
Child 176 belongs to systemd-***@root.service.
Apr 24 12:44:32 localhost systemd[1]: systemd-***@root.service: Main
process exited, code=exited, status=0/SUCCESS
Apr 24 12:44:32 localhost systemd[1]: systemd-***@root.service:
Changed start -> exited
Apr 24 12:44:32 localhost systemd[1]: systemd-***@root.service: Job
56 systemd-***@root.service/start finished, result=done
Apr 24 12:44:32 localhost systemd[1]: Finished Cryptography Setup for root.
Apr 24 12:44:32 localhost audit[1]: SERVICE_START pid=1 uid=0
auid=4294967295 ses=4294967295 subj=kernel msg='unit=systemd-***@root
comm="systemd"
exe="/nix/store/zpzn7c5g58srji21flwqmxzbnaa8w29j-systemd-246.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Apr 24 12:44:32 localhost systemd[1]: systemd-***@root.service:
Control group is empty.
Apr 24 12:44:32 localhost kernel: kauditd_printk_skb: 1 callbacks suppressed
Apr 24 12:44:32 localhost kernel: audit: type=1130
audit(1619268272.815:12): pid=1 uid=0 auid=4294967295 ses=4294967295
subj=kernel msg='unit=systemd-***@root comm="systemd"
exe="/nix/store/zpzn7c5g58srji21flwqmxzbnaa8w29j-systemd-246.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Apr 24 12:44:32 localhost systemd[1]: systemd-journald.service: Received
EPOLLHUP on stored fd 43 (stored), closing.
Apr 24 12:44:32 localhost systemd[1]: ***@dev-mapper-root.target
changed dead -> active
Apr 24 12:44:32 localhost systemd[1]: ***@dev-mapper-root.target: Job
69 ***@dev-mapper-root.target/start finished, result=done
Apr 24 12:44:32 localhost systemd[1]: Reached target Block Device
Preparation for /dev/mapper/root.
Apr 24 12:44:32 localhost systemd[1]: initrd-root-device.target:
ConditionPathExists=/etc/initrd-release succeeded.
Apr 24 12:44:32 localhost systemd[1]: initrd-root-device.target changed
dead -> active
Apr 24 12:44:32 localhost systemd[1]: initrd-root-device.target: Job 55
initrd-root-device.target/start finished, result=done
Apr 24 12:44:32 localhost systemd[1]: Reached target Initrd Root Device.
Apr 24 12:44:32 localhost systemd[1]: initrd.target: starting held back,
waiting for: initrd-root-fs.target
Apr 24 12:44:32 localhost systemd[1]: Failed to read pids.max attribute of
cgroup root, ignoring: No data available
Apr 24 12:44:32 localhost systemd[1]: sysroot.mount: About to execute:
/nix/store/wcbrp1x2xfmqdpg4vs759irmm1k74c0s-util-linux-2.36-bin/bin/mount
/dev/gpt-auto-root /sysroot -o ro
Apr 24 12:44:32 localhost systemd[1]: sysroot.mount: Forked
/nix/store/wcbrp1x2xfmqdpg4vs759irmm1k74c0s-util-linux-2.36-bin/bin/mount
as 202
Apr 24 12:44:32 localhost systemd[1]: sysroot.mount: Changed dead ->
mounting
Apr 24 12:44:32 localhost systemd[1]: Mounting Root Partition...
Apr 24 12:44:32 localhost systemd[202]: sysroot.mount: Executing:
/nix/store/wcbrp1x2xfmqdpg4vs759irmm1k74c0s-util-linux-2.36-bin/bin/mount
/dev/gpt-auto-root /sysroot -o ro


I have attached the full boot log of the initrd as an attachment.
initrd-246.log succeeds; whilst initrd-247.log fails

Would love to get some pointers on how to further debug this. I can't tell
from the logs what exactly is going wrong.
Arian van Putten
2021-04-27 16:21:51 UTC
Permalink
Hi all,

In case anybody is interested.
After some heavy debugging I was able to bisect the issue to the following
pull request (which was backported to v247)
https://github.com/systemd/systemd/pull/18802#issuecomment-827707662 . I
left comments there on what broke.

Thanks for the help either way!
Post by Arian van Putten
Dear list,
I've been working on trying to integrate systemd-gpt-auto and
systemd-cryptsetup into NixOS's stage-1 init.
Everything was working great on 246; but when I updated from kernel 5.4
to 5.10 and from systemd 246 to 247, the setup stopped working.
A start job is running for /dev/gpt-auto-root
[ TIME ] Timed out waiting for device /dev/gpt-auto-root.
Interestingly. Both /dev/gpt-auto-root and /dev/gpt-auto-root-luks
exist in the /dev tree so the udev rules are all fired correctly; but for
some reason they are not propagating to the device unit.
Also the btrfs kernel module didn't seem to get automatically loaded for
some reason.
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: Adding watch on
'/dev/dm-0'
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: sd-device: Created db
file '/run/udev/data/b254:0' for '/devices/virtual/block/dm-0'
Apr 24 12:18:46 localhost systemd-udevd[209]: dm-0: Device (SEQNUM=1720,
ACTION=change) processed
Passed 947 byte to netlink monitor
Apr 24 12:18:46 localhost systemd[1]: systemd-journald.service: Received
EPOLLHUP on stored fd 46 (stored), closing.
Apr 24 12:18:46 localhost systemd[1]: Received SIGCHLD from PID 192
(systemd-cryptse).
Apr 24 12:18:46 localhost systemd[1]: Child 192 (systemd-cryptse) died
(code=exited, status=0/SUCCESS)
Main process exited, code=exited, status=0/SUCCESS
Changed start -> exited
Apr 24 12:18:46 localhost systemd[1]: Finished Cryptography Setup for root.
Apr 24 12:18:46 localhost audit[1]: SERVICE_START pid=1 uid=0
comm="systemd"
exe="/nix/store/ixpwj1cxl4rp2qbyn0s2h4k5zj731q7c-systemd-247.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Control group is empty.
changed dead -> active
Apr 24 12:18:46 localhost systemd[1]: Reached target Block Device
Preparation for /dev/mapper/root.
Apr 24 12:18:46 localhost kernel: kauditd_printk_skb: 4 callbacks suppressed
Apr 24 12:18:46 localhost kernel: audit: type=1130
audit(1619266726.241:15): pid=1 uid=0 auid=4294967295 ses=4294967295
exe="/nix/store/ixpwj1cxl4rp2qbyn0s2h4k5zj731q7c-systemd-247.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Apr 24 12:18:50 localhost systemd-udevd[148]: Cleanup idle workers
Apr 24 12:18:50 localhost systemd-udevd[207]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[209]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[205]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[206]: Unload module index
Apr 24 12:18:50 localhost systemd-udevd[207]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[209]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[205]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[206]: Unloaded link configuration
context.
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [205] exited
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [206] exited
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [207] exited
Apr 24 12:18:50 localhost systemd-udevd[148]: Worker [209] exited
Apr 24 12:19:48 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
dev-gpt\x2dauto\x2droot.device/start timed out.
Apr 24 12:19:48 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
56 dev-gpt\x2dauto\x2droot.device/start finished, result=timeout
Apr 24 12:19:48 localhost systemd[1]: Timed out waiting for device
/dev/gpt-auto-root.
Apr 24 12:19:48 localhost systemd[1]: initrd-root-device.target: Job 57
initrd-root-device.target/start finished, result=dependency
Apr 24 12:19:48 localhost systemd[1]: Dependency failed for Initrd Root
Device.
Apr 24 12:19:48 localhost systemd[1]: initrd-root-device.target: Job
initrd-root-device.target/start failed with result 'dependency'.
r 24 12:44:32 localhost systemd-cryptsetup[176]: root: retaining kernel
read ahead of 256 (requested 256)
Apr 24 12:44:32 localhost systemd-cryptsetup[176]: Releasing crypt device
/dev/gpt-auto-root-luks context.
Apr 24 12:44:32 localhost systemd-cryptsetup[176]: Releasing device-mapper
backend.
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: Adding watch on
'/dev/dm-0'
Apr 24 12:44:32 localhost systemd-cryptsetup[176]: Closing read only fd
for /dev/gpt-auto-root-luks.
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: sd-device: Created db
file '/run/udev/data/b254:0' for '/devices/virtual/block/dm-0'
Apr 24 12:44:32 localhost systemd-udevd[193]: dm-0: Device (SEQNUM=2066,
ACTION=change) processed
Passed 891 byte to netlink monitor
Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: dev-gpt\x2dauto\x2droot.device: Job
52 dev-gpt\x2dauto\x2droot.device/start finished, result=done
Apr 24 12:44:32 localhost systemd[1]: Found device /dev/gpt-auto-root.
dev-disk-by\x2did-dm\x2dname\x2droot.device: Changed dead -> plugged
Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: dev-mapper-root.device: Changed dead
-> plugged
Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: dev-dm\x2d0.device: Changed dead ->
plugged
sys-devices-virtual-block-dm\x2d0.device: Changed dead -> plugged
Apr 24 12:44:32 localhost systemd[1]: Received SIGCHLD from PID 176
(systemd-cryptse).
Apr 24 12:44:32 localhost systemd[1]: Child 176 (systemd-cryptse) died
(code=exited, status=0/SUCCESS)
Failed to read oom_kill field of memory.events cgroup attribute: No such
file or directory
Main process exited, code=exited, status=0/SUCCESS
Changed start -> exited
Apr 24 12:44:32 localhost systemd[1]: Finished Cryptography Setup for root.
Apr 24 12:44:32 localhost audit[1]: SERVICE_START pid=1 uid=0
comm="systemd"
exe="/nix/store/zpzn7c5g58srji21flwqmxzbnaa8w29j-systemd-246.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Control group is empty.
Apr 24 12:44:32 localhost kernel: kauditd_printk_skb: 1 callbacks suppressed
Apr 24 12:44:32 localhost kernel: audit: type=1130
audit(1619268272.815:12): pid=1 uid=0 auid=4294967295 ses=4294967295
exe="/nix/store/zpzn7c5g58srji21flwqmxzbnaa8w29j-systemd-246.6/lib/systemd/systemd"
hostname=? addr=? terminal=? res=success'
Apr 24 12:44:32 localhost systemd[1]: systemd-journald.service: Received
EPOLLHUP on stored fd 43 (stored), closing.
changed dead -> active
Apr 24 12:44:32 localhost systemd[1]: Reached target Block Device
Preparation for /dev/mapper/root.
ConditionPathExists=/etc/initrd-release succeeded.
Apr 24 12:44:32 localhost systemd[1]: initrd-root-device.target changed
dead -> active
Apr 24 12:44:32 localhost systemd[1]: initrd-root-device.target: Job 55
initrd-root-device.target/start finished, result=done
Apr 24 12:44:32 localhost systemd[1]: Reached target Initrd Root Device.
Apr 24 12:44:32 localhost systemd[1]: initrd.target: starting held back,
waiting for: initrd-root-fs.target
Apr 24 12:44:32 localhost systemd[1]: Failed to read pids.max attribute of
cgroup root, ignoring: No data available
/nix/store/wcbrp1x2xfmqdpg4vs759irmm1k74c0s-util-linux-2.36-bin/bin/mount
/dev/gpt-auto-root /sysroot -o ro
Apr 24 12:44:32 localhost systemd[1]: sysroot.mount: Forked
/nix/store/wcbrp1x2xfmqdpg4vs759irmm1k74c0s-util-linux-2.36-bin/bin/mount
as 202
Apr 24 12:44:32 localhost systemd[1]: sysroot.mount: Changed dead ->
mounting
Apr 24 12:44:32 localhost systemd[1]: Mounting Root Partition...
/nix/store/wcbrp1x2xfmqdpg4vs759irmm1k74c0s-util-linux-2.36-bin/bin/mount
/dev/gpt-auto-root /sysroot -o ro
I have attached the full boot log of the initrd as an attachment.
initrd-246.log succeeds; whilst initrd-247.log fails
Would love to get some pointers on how to further debug this. I can't tell
from the logs what exactly is going wrong.
Loading...