[systemd-devel] Improving module loading
Umut Tezduyar Lindskog
2014-12-16 15:54:24 UTC
Hi,

Is there a reason why systemd-modules-load loads modules
sequentially? A few things could happen simultaneously, like resolving
symbols. It seems module_mutex is shared by all module loads and
gets taken on a few occasions throughout the execution of
sys_init_module.

The other thought is, what is the preferred way of loading modules
when they are needed? Do they have to be loaded in ExecStartPre= or by
a separate service whose ExecStart uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?

Umut
Tom Gundersen
2014-12-16 15:59:20 UTC
On Tue, Dec 16, 2014 at 4:54 PM, Umut Tezduyar Lindskog
Post by Umut Tezduyar Lindskog
The other thought is, what is the preferred way of loading modules
when they are needed.
Rely on kernel autoloading. Not all modules support that yet, but most
do. What do you have in mind?
Post by Umut Tezduyar Lindskog
Do they have to be loaded on ExecStartPre= or as
a separate service which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
I'd much rather we improve the autoloading support...

Cheers,

Tom
Umut Tezduyar Lindskog
2014-12-16 16:21:23 UTC
Post by Tom Gundersen
On Tue, Dec 16, 2014 at 4:54 PM, Umut Tezduyar Lindskog
Post by Umut Tezduyar Lindskog
The other thought is, what is the preferred way of loading modules
when they are needed.
Rely on kernel autoloading. Not all modules support that yet, but most
do. What do you have in mind?
We have some modules that we don't need loaded so early. We
would much prefer them to be loaded when they are needed. For example, we
don't need to load the SD driver module until the service that uses the SD
driver is starting. With this idea in mind I started some
investigation. Then I realized that our CPU utilization is not that
high during module loading, and I blame that on the sequential loading of
modules. I think this can be improved on the systemd-modules-load
side.
Post by Tom Gundersen
Post by Umut Tezduyar Lindskog
Do they have to be loaded on ExecStartPre= or as
a separate service which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
I'd much rather we improve the autoloading support...
My understanding is that autoloading loads a module if the
hardware is available. What I am after, though, is loading modules
when they are needed.

Umut
Post by Tom Gundersen
Cheers,
Tom
Tom Gundersen
2014-12-20 15:56:54 UTC
Post by Umut Tezduyar Lindskog
Post by Tom Gundersen
On Tue, Dec 16, 2014 at 4:54 PM, Umut Tezduyar Lindskog
Post by Umut Tezduyar Lindskog
The other thought is, what is the preferred way of loading modules
when they are needed.
Rely on kernel autoloading. Not all modules support that yet, but most
do. What do you have in mind?
We have some modules that we don't need them to be loaded so early. We
much prefer them to be loaded when they are needed. For example we
don't need to load the SD driver module until the service that uses SD
driver is starting. With this idea in mind I started some
investigation. Then I realized that our CPU utilization is not that
high during module loading and I blame it to the sequential loading of
modules. I am thinking this can be improved on systemd-modules-load
side.
We can probably improve the module loading by making it use worker
processes similar to how udev works. In principle this could cause problems
with things making assumptions on the order of module loading, so that is
something to keep in mind. That said, note that most modules will be loaded
by udev which already does it in parallel...
Post by Umut Tezduyar Lindskog
Post by Tom Gundersen
Post by Umut Tezduyar Lindskog
Do they have to be loaded on ExecStartPre= or as
a separate service which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
I'd much rather we improve the autoloading support...
My understanding is autoloading support is loading a module if the
hardware is available.
That, or for non-hardware modules when the functionality is first used
(networking, filesystems, ...).
Post by Umut Tezduyar Lindskog
What I am after is though loading the module
when they are needed.
This sounds really fragile to me (having to encode this dependency
everywhere rather than just always assume the functionality is available).

Cheers,

Tom
Hoyer, Marko (ADITG/SW2)
2014-12-21 13:25:17 UTC
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Saturday, December 20, 2014 4:57 PM
To: Umut Tezduyar
Cc: systemd Mailing List
Subject: Re: [systemd-devel] Improving module loading
Post by Umut Tezduyar Lindskog
Post by Tom Gundersen
On Tue, Dec 16, 2014 at 4:54 PM, Umut Tezduyar Lindskog
Post by Umut Tezduyar Lindskog
The other thought is, what is the preferred way of loading modules
when they are needed.
Rely on kernel autoloading. Not all modules support that yet, but
most do. What do you have in mind?
We have some modules that we don't need them to be loaded so early.
We much prefer them to be loaded when they are needed. For example we
don't need to load the SD driver module until the service that uses
SD driver is starting. With this idea in mind I started some
investigation. Then I realized that our CPU utilization is not that
high during module loading and I blame it to the sequential loading
of modules. I am thinking this can be improved on systemd-modules-load
side.
We can probably improve the module loading by making it use worker
processes similar to how udev works.
We realized it with threads, which are much cheaper for this job.
In principle this could cause
problems with things making assumptions on the order of module loading,
so that is something to keep in mind.
Mmm, I don't see any issues here since the dependencies are normally properly described on kernel side (otherwise you have a problem in any case). In worst case you are losing potential to parallelize loading of modules if your algorithm for distributing the modules to workers is not working efficiently.
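As an illustration of the point about dependencies and distributing modules to workers, here is a minimal Python sketch. It is not the tool mentioned in this thread; the function name `load_waves` and the dependency map passed to it are assumptions made for the example. It groups modules into "waves" so that every module in a wave depends only on modules from earlier waves; modules within one wave could then be handed to parallel workers.

```python
def load_waves(deps):
    """Split modules into ordered waves; each wave depends only on earlier ones.

    deps maps a module name to the set of module names it depends on
    (a hypothetical input, e.g. derived from the kernel's dependency data).
    """
    remaining = {mod: set(d) for mod, d in deps.items()}
    # Modules that only appear as dependencies must be scheduled as well.
    for dep_set in list(remaining.values()):
        for dep in dep_set:
            remaining.setdefault(dep, set())

    waves = []
    loaded = set()
    while remaining:
        # A module is ready once all of its dependencies are loaded.
        ready = sorted(m for m, d in remaining.items() if d <= loaded)
        if not ready:
            raise ValueError("dependency cycle among: %s" % sorted(remaining))
        waves.append(ready)
        loaded.update(ready)
        for m in ready:
            del remaining[m]
    return waves
```

With illustrative names, `load_waves({"a": {"c"}, "b": {"c"}})` yields one wave containing only `c`, followed by a wave with `a` and `b`. A long dependency chain degenerates into one module per wave, which is the "losing potential to parallelize" case.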
That said, note that most modules
will be loaded by udev which already does it in parallel...
... only if you are still triggering "add" uevent through the complete device tree during startup, which is really expensive and does not go with the "load things not before they are actually needed" philosophy very well ...
Post by Umut Tezduyar Lindskog
Post by Tom Gundersen
Post by Umut Tezduyar Lindskog
Do they have to be loaded on ExecStartPre= or as a separate service
which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
I'd much rather we improve the autoloading support...
My understanding is autoloading support is loading a module if the
hardware is available.
That, or for non-hardware modules when the functionality is first used
(networking, filesystems, ...).
Post by Umut Tezduyar Lindskog
What I am after is though loading the module when they are needed.
This sounds really fragile to me (having to encode this dependency
everywhere rather than just always assume the functionality is
available).
That is actually the main challenge when this approach is applied. But the assumption you are talking about is in many cases a kind of facade only at least if your applications
- are not waiting for udev to completely settle after the coldplug trigger, or
- are able to deal with devices in a hotplug fashion.
Cheers,
Tom
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Jóhann B. Guðmundsson
2014-12-16 17:59:03 UTC
Post by Umut Tezduyar Lindskog
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving the
symbols etc. Seems like modules_mutex is common on module loads which
gets locked up on few occasions throughout the execution of
sys_init_module.
The other thought is, what is the preferred way of loading modules
when they are needed. Do they have to be loaded on ExecStartPre= or as
a separate service which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
Kernel modules should never be loaded from type systemd units

JBG
Hoyer, Marko (ADITG/SW2)
2014-12-20 10:45:34 UTC
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving the
symbols etc. Seems like modules_mutex is common on module loads which
gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up very fast with limited resources) and gained a lot. Mainly, IO and CPU can be better utilized by loading modules in parallel (one module is loaded while another one probes for hardware or is doing memory initialization).
The other thought is, what is the preferred way of loading modules when
they are needed. Do they have to be loaded on ExecStartPre= or as a
separate service which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
I had such a discussion earlier with some of the systemd guys. My intention was to introduce an additional unit for module loading for exactly the reason you mentioned. The (reasonable) outcome was:
- It is dangerous to load kernel modules from PID 1 since module loading can get stuck
- Since modules are actually loaded by the thread that makes the syscall, systemd would need additional threads
- Multithreading is not really wanted in systemd for stability reasons
Probably the safest way to do what you intend is to use an additional process to load your modules, which can easily be done by using ExecStartPre= in a service file. We are doing it exactly this way, not with kmod but with a tool that loads modules in parallel.
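
A minimal sketch of what such a helper process could look like, written in Python for illustration only. It is not the tool referred to above; the function names are made up for the example, and it simply assumes that running the `modprobe` command once per module from a small thread pool is acceptable on the target.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def modprobe(name):
    """Load one module by running the modprobe command in a child process."""
    return subprocess.run(["modprobe", name]).returncode


def load_parallel(modules, loader=modprobe, max_workers=4):
    """Load modules concurrently and return {module name: exit status}.

    loader is pluggable so the scheduling can be exercised without
    touching the kernel; max_workers=4 is an arbitrary assumption.
    """
    modules = list(modules)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so statuses line up with modules.
        statuses = list(pool.map(loader, modules))
    return dict(zip(modules, statuses))
```

A script wrapping `load_parallel()` could be started from ExecStartPre= of the service that needs the drivers, so the loading happens in a separate process and never in PID 1.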

Btw: Be careful with synchronization. We found that lots of kernel modules export device nodes in the background (alsa, some graphics drivers, ...). With the procedure mentioned above, you are moving the kernel module loading and the actual use of the driver interface very close together in time. This might lead to race conditions. It is even worse when you need to access sysfs attributes, which some drivers export even after the device is already available and uevents have been sent out. For such modules, there actually is no other way to synchronize than waiting for the attributes to appear.
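
Waiting for an attribute to appear can be sketched as a simple polling loop. This is a Python illustration only; the function name, the timeout, and the polling interval are assumptions, not part of any tool mentioned in this thread.

```python
import os
import time


def wait_for_path(path, timeout=2.0, interval=0.01):
    """Poll until path exists; return True if it appeared before the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would pass the device node or sysfs attribute file it is about to open and treat a `False` result as a startup failure instead of racing the driver.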
Umut
_______________________________________________
systemd-devel mailing list
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Greg KH
2014-12-20 17:10:14 UTC
Post by Hoyer, Marko (ADITG/SW2)
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving the
symbols etc. Seems like modules_mutex is common on module loads which
gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up
very fast with limited resources) and gained a lot. Mainly, IO and CPU
can be better utilized by loading modules in parallel (one module is
loaded while another one probes for hardware or is doing memory
initialization).
If you have control over your kernel, why not just build the modules
into the kernel, then all of this isn't an issue at all and there is no
overhead of module loading?

greg k-h
Umut Tezduyar Lindskog
2014-12-20 17:40:49 UTC
Post by Greg KH
Post by Hoyer, Marko (ADITG/SW2)
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving the
symbols etc. Seems like modules_mutex is common on module loads which
gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up
very fast with limited resources) and gained a lot. Mainly, IO and CPU
can be better utilized by loading modules in parallel (one module is
loaded while another one probes for hardware or is doing memory
initialization).
If you have control over your kernel, why not just build the modules
into the kernel, then all of this isn't an issue at all and there is no
overhead of module loading?
For us, licenses are the problem.
Umut
Post by Greg KH
greg k-h
Greg KH
2014-12-20 19:10:40 UTC
Post by Umut Tezduyar Lindskog
Post by Greg KH
Post by Hoyer, Marko (ADITG/SW2)
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving the
symbols etc. Seems like modules_mutex is common on module loads which
gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up
very fast with limited resources) and gained a lot. Mainly, IO and CPU
can be better utilized by loading modules in parallel (one module is
loaded while another one probes for hardware or is doing memory
initialization).
If you have control over your kernel, why not just build the modules
into the kernel, then all of this isn't an issue at all and there is no
overhead of module loading?
For us, licenses are the problem.
Then you are on your own, and be prepared to deal with the legal issues
involved.

No sympathy here.

greg k-h
Hoyer, Marko (ADITG/SW2)
2014-12-21 12:31:30 UTC
-----Original Message-----
Sent: Saturday, December 20, 2014 6:11 PM
To: Hoyer, Marko (ADITG/SW2)
Subject: Re: [systemd-devel] Improving module loading
Post by Hoyer, Marko (ADITG/SW2)
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving
the symbols etc. Seems like modules_mutex is common on module loads
which gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up
very fast with limited resources) and gained a lot. Mainly, IO and CPU
can be better utilized by loading modules in parallel (one module is
loaded while another one probes for hardware or is doing memory
initialization).
If you have control over your kernel, why not just build the modules
into the kernel, then all of this isn't an issue at all and there is no
overhead of module loading?
It is a question of kernel image size and startup performance.
- We are somehow limited in terms of size from where we are loading the kernel.
- Loading the image is a kind of monolithic block in terms of time where you can hardly do things in parallel
- We are strongly following the idea from Umut (loading things not before they are actually needed) to get up early services very early (e.g. rendering a camera on a display in less than 2secs after power on)
- Some modules do time / CPU consuming things in init(), which would delay the entry time into userspace
-> deferred init calls are not really a solution because they cannot be controlled in the needed granularity

So in the end it is actually a trade-off between compiling things in and spending the overhead of module loading to gain the flexibility to load things later.
greg k-h
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Greg KH
2014-12-21 17:47:04 UTC
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
If you have control over your kernel, why not just build the modules
into the kernel, then all of this isn't an issue at all and there is no
overhead of module loading?
It is a question of kernel image size and startup performance.
- We are somehow limited in terms of size from where we are loading the kernel.
What do you mean by this? What is limiting this? What is your limit?
How large are these kernel modules that you are having a hard time to
build into your kernel image?
Post by Hoyer, Marko (ADITG/SW2)
- Loading the image is a kind of monolithic block in terms of time where you can hardly do things in parallel
How long does loading a tiny kernel image actually take?
Post by Hoyer, Marko (ADITG/SW2)
- We are strongly following the idea from Umut (loading things not before they are actually needed) to get up early services very early (e.g. rendering a camera on a display in less than 2secs after power on)
Ah, IVI, you all have some really strange hardware configurations :(

There is no reason you have to do a "cold reset" to get your boot times
down, there is the fun "resume from a system image" solution that others
have done that can get that camera up and displayed in milliseconds.
Post by Hoyer, Marko (ADITG/SW2)
- Some modules do time / CPU consuming things in init(), which would delay the entry time into userspace
Then fix them, that's the best thing about Linux, you have the source to
not accept problems like this! And no module should do expensive things
in init(), we have been fixing issues like that for a while now.
Post by Hoyer, Marko (ADITG/SW2)
-> deferred init calls are not really a solution because they cannot be controlled in the needed granularity
We have loads of granularity there, how much more do you need?
Post by Hoyer, Marko (ADITG/SW2)
So in the end it is actually a trade-off between compiling things in and spending the overhead of module loading to gain the flexibility to load things later.
That's fine, but you will run into the kernel lock that prevents modules
loading at the same time for some critical sections, if your I/O issues
don't limit you already.

There are lots of areas you can work on to speed up boot times other
than worrying about multithreaded kernel module loading. I really doubt
this is going to be the final solution for your problems.

good luck,

greg k-h
Hoyer, Marko (ADITG/SW2)
2014-12-23 11:25:23 UTC
Hi Greg,

thx a lot for the feedback and hints. You asked for lots of numbers, I tried to add some I have available here at the moment. Find them inline. I'm additionally interested in some more details of some of the ideas you outlined. Would be nice if you could go some more into details at certain points. I added some questions inline as well.
-----Original Message-----
Sent: Sunday, December 21, 2014 6:47 PM
To: Hoyer, Marko (ADITG/SW2)
Subject: Re: [systemd-devel] Improving module loading
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
If you have control over your kernel, why not just build the modules
into the kernel, then all of this isn't an issue at all and there is
no overhead of module loading?
It is a question of kernel image size and startup performance.
- We are somehow limited in terms of size from where we are loading
the kernel.
What do you mean by this? What is limiting this? What is your limit?
How large are these kernel modules that you are having a hard time to
build into your kernel image?
- As far as I remember, we have special fastboot-aware partitions on the eMMC that are available very fast. But those are very limited in size. I'm not really sure about this point, though; it is something I was told.

- targeted kernel size: 2-3MB packed

- Kernel modules:
- we have heavy graphics drivers (~800kb, stripped), they are needed half the way at startup
- video processing unit drivers (don't know the size), they are needed half the way at startup
- wireless & bluetooth, they are needed very late
- usb subsystem, conventionally needed very late (but this finally depends on the concrete product)
- hot plug mass storage handling, conventionally needed very late (but this finally depends on the concrete product)
- audio driver, in most of our products needed very late
- some drivers for INC communication (partly needed very early -> we compiled them in, partly needed later -> we have them as modules)

All in all I'd guess we are getting twice the size if we would compile in all the stuff.
Post by Hoyer, Marko (ADITG/SW2)
- Loading the image is a kind of monolithic block in terms of time
where you can hardly do things in parallel
How long does loading a tiny kernel image actually take?
I don't know exact numbers, sorry. I guess something between 50-200ms plus time for unpacking. But this loading and unpacking job is important since it is located directly on the critical path.
Post by Hoyer, Marko (ADITG/SW2)
- We are strongly following the idea from Umut (loading things not
before they are actually needed) to get up early services very early
(e.g. rendering a camera on a display in less than 2secs after power
on)
Ah, IVI, you all have some really strange hardware configurations :(
Yes IVI. Since we are developing our hardware as well as our software (different department), I'm interested in getting more infos about what is strange about IVI hardware configuration in general. Maybe we can improve things to a certain extent. Could you go more into details?
There is no reason you have to do a "cold reset" to get your boot times
down, there is the fun "resume from a system image" solution that
others have done that can get that camera up and displayed in
milliseconds.
I'm interested in this point.
- Are you talking about "Save To RAM", "Save to Disk", or a hybrid combination of both?
- Or do you have something completely different in mind?

I personally thought about such a solution as well. I'm by now not fully convinced since we have really hard timing requirements (partly motivated by law). So I see two different principal ways for a "resume" solution:
- either the resume solution is robust enough to guarantee to come up properly every boot up
- achieved for instance by a static system image that brings the system into a static state very fast, from which on a kind of conventional boot is going on then ...
- or the boot up after an actual "cold reset" is fast enough to at least guarantee the really important timing requirements in case the resume is not coming up properly
Post by Hoyer, Marko (ADITG/SW2)
- Some modules do time / CPU consuming things in init(), which would
delay the entry time into userspace
Then fix them, that's the best thing about Linux, you have the source
to not accept problems like this! And no module should do expensive
things in init(), we have been fixing issues like that for a while now.
This would probably be the cleanest solution. In the long term we are of course going this way, and we are trying to get suppliers to go this way with us as well. But in the end, we have to bring up products now, by a fixed date. So it is sometimes easier and more stable to work around suboptimal things.

For instance:
- refactoring a driver that is doing lots of CPU intensive things in init()
vs.
- taking the module as it is and using the time by loading things from emmc in parallel
Post by Hoyer, Marko (ADITG/SW2)
-> deferred init calls are not really a solution because they cannot
be controlled in the needed granularity
We have loads of granularity there, how much more do you need?
Post by Hoyer, Marko (ADITG/SW2)
So in the end it is actually a trade-off between compiling things in and
spending the overhead of module loading to gain the flexibility to load
things later.
That's fine, but you will run into the kernel lock that prevents
modules loading at the same time for some critical sections, if your
I/O issues don't limit you already.
There are lots of areas you can work on to speed up boot times other
than worrying about multithreaded kernel module loading. I really
doubt this is going to be the final solution for your problems.
It is of course not. The initial intention to develop something new here on top of kmod or systemd-modules-load was not to load kernel modules in parallel. We found that for lots of our modules we can actually gain some benefit by loading things in parallel so we decided to include the threaded approach as well.

The initial motivation to develop something new here was to get rid of using the "udevd" / "udevadm trigger" approach during startup. This "setting up the system hardware completely in one early phase during startup" approach does not fit our timing requirements well. So we are setting up our static hardware piece by piece, exactly at the point in time when it is needed, using our tool. Besides the actual module loading, this tool provides a mechanism for synchronization and does some additional setup on top. The threaded loading is just a feature.

Some numbers to the above mentioned arguments:
- in an idle system, systemd-udev-trigger.service takes about 150-200ms just to get the complete device tree to send out "add" uevents again (udevd was deactivated during this measurement)
- the processing of the resulting uevents by udevd takes 1-2 seconds (with the default rule set) again in an idle system
- in a general solution, we'd need to wait for udevd to completely settle until we can start using the devices
good luck,
greg k-h
Thx ;)
Tom Gundersen
2014-12-23 14:22:49 UTC
On Tue, Dec 23, 2014 at 12:25 PM, Hoyer, Marko (ADITG/SW2)
Post by Hoyer, Marko (ADITG/SW2)
- in an idle system, systemd-udev-trigger.service takes about 150-200ms just to get the complete device tree to send out "add" uevents again (udevd was deactivated during this measurement)
I'm working on optimizing this a bit (though we are obviously talking
about percentage points, not orders of magnitude). I'd be interested
in seeing if it could be further optimized to help your usecase once I
have pushed out the basic rework.
Post by Hoyer, Marko (ADITG/SW2)
- the processing of the resulting uevents by udevd takes 1-2 seconds (with the default rule set) again in an idle system
I haven't looked at this much recently, but if you find any
bottlenecks, do shout so we can look at improving it if possible.
Post by Hoyer, Marko (ADITG/SW2)
- in a general solution, we'd need to wait for udevd to completely settle until we can start using the devices
Hm, I would question this. In a semi-modern system you should not need
to wait for settle at all, as everything should be hotplug capable. If
you are in total control of your software (which I assume you are),
you should be able to fix the non-hotplug capable software and drop
the settling. Have you tried this? What is the blocker here?

If you drop settling, then I guess the biggest tweak you can do is
with the order in which devices are triggered. I'm not seeing how we
can do something sensible about this in a generic fashion, but might
be worth fiddling with to figure out what is the optimal order on your
system (to have a baseline for future optimizations).

Cheers,

Tom
Kay Sievers
2014-12-23 16:01:01 UTC
Post by Tom Gundersen
On Tue, Dec 23, 2014 at 12:25 PM, Hoyer, Marko (ADITG/SW2)
Post by Hoyer, Marko (ADITG/SW2)
- in an idle system, systemd-udev-trigger.service takes about 150-200ms just to get the complete device tree to send out "add" uevents again (udevd was deactivated during this measurement)
I'm working on optimizing this a bit (though we are obviously talking
about percentage points, not orders of magnitude). I'd be interested
in seeing if it could be further optimized to help your usecase once I
have pushed out the basic rework.
This is in most cases caused by conceptually broken implementations of
drivers or subsystems which call into the driver or even the firmware
when the "uevent" file is read by userspace.

Subsystems like "power_supply" had the great idea to transport device
measurement data over uevents. This can't be fixed from userspace, it
would need to be fixed in the kernel.

Kay
Tom Gundersen
2014-12-23 14:45:32 UTC
On Tue, Dec 23, 2014 at 12:25 PM, Hoyer, Marko (ADITG/SW2)
Post by Hoyer, Marko (ADITG/SW2)
- in an idle system, systemd-udev-trigger.service takes about 150-200ms just to get the complete device tree to send out "add" uevents again (udevd was deactivated during this measurement)
I'm working on optimizing this a bit (though we are obviously talking
about percentage points, not orders of magnitude). I'd be interested
in seeing if it could be further optimized to help your usecase once I
have pushed out the basic rework.
Post by Hoyer, Marko (ADITG/SW2)
- the processing of the resulting uevents by udevd takes 1-2 seconds (with the default rule set) again in an idle system
I haven't looked at this much recently, but if you find any
bottlenecks, do shout so we can look at improving it if possible.
Post by Hoyer, Marko (ADITG/SW2)
- in a general solution, we'd need to wait for udevd to completely settle until we can start using the devices
Hm, I would question this. In a semi-modern system you should not need
to wait for settle at all, as everything should be hotplug capable. If
you are in total control of your software (which I assume you are),
you should be able to fix the non-hotplug capable software and drop
the settling. Have you tried this? What is the blocker here?

If you drop settling, then I guess the biggest tweak you can do is
with the order in which devices are triggered. I'm not seeing how we
can do something sensible about this in a generic fashion, but might
be worth fiddling with to figure out what is the optimal order on your
system (to have a baseline for future optimizations).

Cheers,

Tom
Greg KH
2014-12-23 20:55:50 UTC
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
What do you mean by this? What is limiting this? What is your limit?
How large are these kernel modules that you are having a hard time to
build into your kernel image?
- As far as I remember, we have special fastboot aware partitions on the emmc that are available very fast. But those are very limited in size. But with this point I'm pretty much not sure. This is something I got told.
- targeted kernel size: 2-3MB packed
- we have heavy graphics drivers (~800kb, stripped), they are needed half the way at startup
- video processing unit drivers (don't know the size), they are needed half the way at startup
- wireless & bluetooth, they are needed very late
- usb subsystem, conventionally needed very late (but this finally depends on the concrete product)
- hot plug mass storage handling, conventionally needed very late (but this finally depends on the concrete product)
- audio driver, in most of our products needed very late
- some drivers for INC communication (partly needed very early -> we compiled them in, partly needed later -> we have them as modules)
All in all I'd guess we are getting twice the size if we would compile in all the stuff.
All of those should dynamically be loaded when the hardware is found by
the system, so I don't see why you are trying to load them "by hand".
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
Post by Hoyer, Marko (ADITG/SW2)
- Loading the image is a kind of monolithic block in terms of time
where you can hardly do things in parallel
How long does loading a tiny kernel image actually take?
I don't know exact numbers, sorry. I guess something between 50-200ms
plus time for unpacking. But this loading and unpacking job is
important since it is located directly on the critical path.
Exact numbers matter :)
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
Post by Hoyer, Marko (ADITG/SW2)
- We are strongly following the idea from Umut (loading things not
before they are actually needed) to get up early services very early
(e.g. rendering a camera on a display in less than 2secs after power
on)
Ah, IVI, you all have some really strange hardware configurations :(
Yes IVI. Since we are developing our hardware as well as our software
(different department), I'm interested in getting more infos about
what is strange about IVI hardware configuration in general. Maybe we
can improve things to a certain extent. Could you go more into
details?
Traditionally IVI systems are _very_ underpowered, use old processors,
have tiny storage systems on very slow interfaces, very little memory,
huge numbers of IPC calls at startup due to legacy userspace programs
written originally for other operating systems, and expect to get
high-performance results out of bad hardware decisions.
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
There is no reason you have to do a "cold reset" to get your boot times
down, there is the fun "resume from a system image" solution that
others have done that can get that camera up and displayed in
milliseconds.
I'm interested in this point.
- Are you talking about "Save To RAM", "Save to Disk", or a hybrid combination of both?
- Or do you have something completely different in mind?
A number of devices in the past have done a "save system image" to
flash, and then when starting up, just load the system image into memory
and jump into it, everything up and running with no "startup time"
needed other than the initial memory load.

When updating the system with new software, just be sure to write out a
new "initial system state" image at the same time, and you should be
fine for your next boot.

This idea / implementation has been around for a very long time, and has
shipped in lots of devices, solving the "image in x seconds from power
on" problem quite easily.
Post by Hoyer, Marko (ADITG/SW2)
- either the resume solution is robust enough to guarantee to come up properly every boot up
- achieved for instance by a static system image that brings the system into a static state very fast, from which on a kind of conventional boot is going on then ...
This is the easiest to do.
Post by Hoyer, Marko (ADITG/SW2)
- or the boot up after an actual "cold reset" is fast enough to at least guarantee the really important timing requirements in case the resume is not coming up properly
If you have good hardware, do this, but odds are, your hardware can't do
this, otherwise we wouldn't be having this whole conversation :)
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
Post by Hoyer, Marko (ADITG/SW2)
- Some modules do time / CPU consuming things in init(), which would
delay the entry time into userspace
Then fix them, that's the best thing about Linux, you have the source
to not accept problems like this! And no module should do expensive
things in init(), we have been fixing issues like that for a while now.
This would probably be the cleanest solution. In the long term
we are of course going this way, and we are trying to get
suppliers to go this way with us as well. But in the end, we have to
bring up products now, by a fixed date. So it is sometimes easier, and
more stable, to work around suboptimal things.
As you have the source to everything, this shouldn't be an issue.
Post by Hoyer, Marko (ADITG/SW2)
- refactoring a driver that is doing lots of CPU intensive things in init()
No driver should be doing this, if it does, push back on the supplier of
it as unacceptable.
Post by Hoyer, Marko (ADITG/SW2)
vs.
- taking the module as it is and using the time by loading things from emmc in parallel
Work around crappy code? No, push back, as it obviously doesn't work
properly.
Post by Hoyer, Marko (ADITG/SW2)
Post by Greg KH
Post by Hoyer, Marko (ADITG/SW2)
-> deferred init calls are not really a solution because they cannot
be controlled in the needed granularity
We have loads of granularity there, how much more do you need?
Post by Hoyer, Marko (ADITG/SW2)
So finally it is actually a trade-off between compiling things in and
spending the overhead of module loading to gain the flexibility to load
things later.
That's fine, but you will run into the kernel lock that prevents
modules loading at the same time for some critical sections, if your
I/O issues don't limit you already.
There are lots of areas you can work on to speed up boot times other
than worrying about multithreaded kernel module loading. I really
doubt this is going to be the final solution for your problems.
It is of course not. The initial intention to develop something new
here on top of kmod or systemd-modules-load was not to load kernel
modules in parallel. We found that for lots of our modules we can
actually gain some benefit by loading things in parallel so we decided
to include the threaded approach as well.
The initial motivation to develop something new here was to get rid of
the "udevd" / "udevadm trigger" approach during startup. This
"setting up the system hardware completely in one early phase during
startup" approach does not fit our timing requirements. So
we are setting up our static hardware piece by piece, exactly at the
point in time when it is needed, using our tool. Besides the actual
module loading, this tool provides a mechanism for synchronization
and does some additional setup on top. The threaded loading is
just a feature.
- in an idle system, systemd-udev-trigger.service takes about
150-200ms just to get the complete device tree to send out "add"
uevents again (udevd was deactivated during this measurement)
This all depends on the size of your device tree.
Post by Hoyer, Marko (ADITG/SW2)
- the processing of the resulting uevents by udevd takes 1-2 seconds
(with the default rule set) again in an idle system
That's a very long time, something is wrong there, do you have too many
rules with lots of userspace processes being called out?
Post by Hoyer, Marko (ADITG/SW2)
- in a general solution, we'd need to wait for udevd to completely
settle until we can start using the devices
As others have pointed out, you should never have to wait for 'udevd to
settle'. If you do, something is wrong with your kernel drivers, or your
system design.

best of luck,

greg k-h
Umut Tezduyar Lindskog
2014-12-20 17:44:42 UTC
Permalink
Hi Marko,

Thank you very much for your feedback!

On Sat, Dec 20, 2014 at 11:45 AM, Hoyer, Marko (ADITG/SW2)
Post by Hoyer, Marko (ADITG/SW2)
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving the
symbols etc. Seems like modules_mutex is common on module loads which
gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up very fast with limited resources) and gained a lot. Mainly, IO and CPU can be better utilized by loading modules in parallel (one module is loaded while another one probes for hardware or is doing memory initialization).
The other thought is, what is the preferred way of loading modules when
they are needed. Do they have to be loaded on ExecStartPre= or as a
separate service which has ExecStart that uses kmod to load them?
Wouldn't it be useful to have something like ExecStartModule=?
Do you have links for the discussions, I cannot find them. systemd
already has a service that loads the modules.
Post by Hoyer, Marko (ADITG/SW2)
- It is dangerous to load kernel modules from PID 1 since module loading can get stuck
- Since modules are actually loaded with the thread that calls the syscall, systemd would need additional threads
- Multithreading is not really a goal in systemd, for stability reasons
Probably the safest way to do what you intended is to use an additional process to load your modules, which could easily be done by using ExecStartPre= in a service file. We are doing it exactly this way, not with kmod but with a tool that loads modules in parallel.
Btw: Be careful with synchronization. We found that lots of kernel modules are exporting device nodes in the background (alsa, some graphics drivers, ...). With the procedure mentioned above, you are moving the kernel module loading and the actual use of the driver interface very close together in time. This might lead to race conditions. It is even worse when you need to access sysfs attributes, which are exported by some drivers even after the device is already available and uevents have been sent out. For such modules, there actually is no other way to synchronize but to wait for the attributes to appear.
We are aware of the potential complications and races. But good to be
reminded :)

Umut
Post by Hoyer, Marko (ADITG/SW2)
Umut
_______________________________________________
systemd-devel mailing list
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Best regards
Marko Hoyer
Software Group II (ADITG/SW2)
Tel. +49 5121 49 6948
Hoyer, Marko (ADITG/SW2)
2014-12-21 13:03:36 UTC
Permalink
-----Original Message-----
Sent: Saturday, December 20, 2014 6:45 PM
To: Hoyer, Marko (ADITG/SW2)
Subject: Re: [systemd-devel] Improving module loading
Hi Marko,
Thank you very much for your feedback!
You're welcome ;)
On Sat, Dec 20, 2014 at 11:45 AM, Hoyer, Marko (ADITG/SW2)
Post by Hoyer, Marko (ADITG/SW2)
Hi,
-----Original Message-----
From: systemd-devel [mailto:systemd-devel-
Sent: Tuesday, December 16, 2014 4:55 PM
Subject: [systemd-devel] Improving module loading
Hi,
Is there a reason why systemd-modules-load is loading modules
sequentially? Few things can happen simultaneously like resolving
the symbols etc. Seems like modules_mutex is common on module loads
which gets locked up on few occasions throughout the execution of
sys_init_module.
We are actually doing this (in embedded systems which need to be up
very fast with limited resources) and gained a lot. Mainly, IO and CPU
can be better utilized by loading modules in parallel (one module is
loaded while another one probes for hardware or is doing memory
initialization).
Post by Hoyer, Marko (ADITG/SW2)
The other thought is, what is the preferred way of loading modules
when they are needed. Do they have to be loaded on ExecStartPre= or
as a separate service which has ExecStart that uses kmod to load
them?
Wouldn't it be useful to have something like ExecStartModule=?
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
Do you have links for the discussions, I cannot find them.
Actually not, sorry. The discussion was not done via any mailing list.
systemd already has a service that loads the modules.
Sorry, there is a word missing in my sentence above. My idea was not to introduce a "unit" for module loading but a separate "unit type", such as .kmodule. The idea was to define .kmodule units that each load one kernel module or a set of them at a certain point during startup, simply by integrating them into the startup dependency tree. This idea would require integrating some kind of worker threads into systemd. The outcome was as summarized below.

The advantages over systemd-modules-load are:
- modules can be loaded in parallel
- different sets of modules can be loaded at different points in time during startup
Post by Hoyer, Marko (ADITG/SW2)
- It is dangerous to load kernel modules from PID 1 since module loading can get stuck
- Since modules are actually loaded with the thread that calls the
syscall, systemd would need additional threads
- Multi Threading is not really aimed in systemd for stability
reasons
Post by Hoyer, Marko (ADITG/SW2)
The probably safest way to do what you intended is to use an
additional process to load your modules, which could be easily done by
using ExecStartPre= in a service file. We are doing it exactly this way
not with kmod but with a tool that loads modules in parallel.
Post by Hoyer, Marko (ADITG/SW2)
Btw: Be careful with synchronization. We found that lots of kernel
modules are exporting device nodes in the background (alsa, some
graphics driver, ...). With the proceeding mentioned above, you are
moving the kernel module loading and the actual use of the driver
interface very close together in time. This might lead to race
conditions. It is even worse when you need to access sys attributes,
which are exported by some drivers even after the device is already
available and uevents have been sent out. For such modules, there
actually is no other way for synchronization but waiting for the
attributes to appear.
We are aware of the potential complications and races. But good to be
reminded :)
;) We actually stumbled over lots of things here while we rolled out this approach. Sometimes it is really funny that simple questions such as "What does your service actually need?" are hard to answer. It seems that sometimes things work more or less accidentally because the udev trigger comes very early compared to the startup of the services.
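The "wait for the attributes to appear" synchronization mentioned above can be sketched as a simple polling loop. This is only an illustration under assumptions, not the tool described in this thread: the function name wait_for_path, the timeout and the polling interval are all made up for the example.

```python
import os
import time


def wait_for_path(path, timeout=2.0, interval=0.01):
    """Poll until `path` exists; return True if it appeared before the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        # A device node or sysfs attribute may show up some time after the
        # module load call has returned, so keep checking until the deadline.
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A service wrapper could call something like wait_for_path("/sys/class/sound/card0/id") after loading a sound driver and before starting the consumer; the path here is only an example and depends on the driver. Polling is the crudest option, and listening for uevents is preferable when the driver sends them for the object in question.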
Umut
Post by Hoyer, Marko (ADITG/SW2)
Umut
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Ivan Shapovalov
2014-12-21 14:25:58 UTC
Permalink
Post by Hoyer, Marko (ADITG/SW2)
-----Original Message-----
Sent: Saturday, December 20, 2014 6:45 PM
To: Hoyer, Marko (ADITG/SW2)
Subject: Re: [systemd-devel] Improving module loading
[...]
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
Do you have links for the discussions, I cannot find them.
Actually not, sorry. The discussion was not done via any mailing list.
systemd already has a service that loads the modules.
Sorry, there is a word missing in my sentence above. My idea was not to introduce a "unit" for modules loading but an own "unit type", such as .kmodule. The idea was to define .kmodule units to load one or a set of kernel modules each at a certain point during startup by just integrating them into the startup dependency tree. This idea would require integrating kind of worker threads into systemd. The outcome was as summarized below.
Why would you need a separate unit type for that?

load-***@.service:

[Unit]
Description=Load kernel module %I
DefaultDependencies=no

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/modprobe %I

...then add a dependency like Requires=load-***@foo.service and After=load-***@foo.service.
--
Ivan Shapovalov / intelfx /
Hoyer, Marko (ADITG/SW2)
2014-12-21 16:42:23 UTC
Permalink
-----Original Message-----
Sent: Sunday, December 21, 2014 3:26 PM
Cc: Hoyer, Marko (ADITG/SW2); Umut Tezduyar Lindskog
Subject: Re: [systemd-devel] Improving module loading
Post by Hoyer, Marko (ADITG/SW2)
-----Original Message-----
Sent: Saturday, December 20, 2014 6:45 PM
To: Hoyer, Marko (ADITG/SW2)
Subject: Re: [systemd-devel] Improving module loading
[...]
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
Do you have links for the discussions, I cannot find them.
Actually not, sorry. The discussion was not done via any mailing list.
systemd already has a service that loads the modules.
Sorry, there is a word missing in my sentence above. My idea was not
to introduce a "unit" for modules loading but an own "unit type", such
as .kmodule. The idea was to define .kmodule units to load one or a set
of kernel modules each at a certain point during startup by just
integrating them into the startup dependency tree. This idea would
require integrating kind of worker threads into systemd. The outcome
was as summarized below.
Why would you need a separate unit type for that?
[Unit]
Description=Load kernel module %I
DefaultDependencies=no
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/bin/modprobe %I
To prevent forking a process for that ... We earlier had some issue with cgroups in the kernel, which caused between 20 and 60ms delay per process executed by systemd.

But actually we are now doing it exactly this way, not with modprobe but with another tool, which can load modules in parallel, takes care of synchronization (devices and attributes), and does some other stuff as well ...

In some cases, we don't even have an additional unit for that. We just put the kmod call in an ExecStartPre= statement in the service file that requires the module(s) to be loaded beforehand.
--
Ivan Shapovalov / intelfx /
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Lennart Poettering
2014-12-22 15:04:55 UTC
Permalink
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
- It is dangerous to load kernel modules from PID 1 since module loading can get stuck
- Since modules are actually loaded with the thread that calls the
syscall, systemd would need additional threads
- Multi Threading is not really aimed in systemd for stability reasons
The probably safest way to do what you intended is to use an
additional process to load your modules, which could be easily done
by using ExecStartPre= in a service file. We are doing it exactly
this way not with kmod but with a tool that loads modules in
parallel.
I'd be willing to merge a good patch that beefs up
systemd-modules-load to load the specified modules in parallel, with
one thread for each.

We already have a very limited number of threaded bits in systemd, and
I figure it would be OK to do that for this too.

Please keep the threading minimal though, i.e. one kmod context per
thread, so that we need no synchronization and no locking. One thread
per module, i.e. no worker thread logic with thread reuse. Also,
please set a thread name, so that a hanging module load only hangs one
specific thread and the backtrace shows which module is at fault.
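A minimal sketch of the thread-per-module layout described above, written in Python rather than against libkmod. It is not a patch for systemd-modules-load: it assumes a modprobe binary on PATH and substitutes one modprobe process per thread for "one kmod context per thread". The names load_modules_in_parallel and the "load-" thread name prefix are made up for the example.

```python
import subprocess
import threading


def modprobe(module):
    """Load one module by running modprobe (assumed to be on PATH)."""
    return subprocess.run(["modprobe", module], check=False).returncode


def load_modules_in_parallel(modules, loader=modprobe):
    """Start one short-lived thread per module; return {module: result}."""
    results = {}

    def worker(module):
        # Each worker writes only its own key, so no explicit lock is used.
        try:
            results[module] = loader(module)
        except Exception as exc:  # keep the failure instead of losing it
            results[module] = exc

    threads = [
        # The thread name carries the module name, so a hung load can be
        # attributed to a specific module when inspecting the threads.
        threading.Thread(target=worker, args=(m,), name=f"load-{m}")
        for m in modules
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

There is no thread pool and no thread reuse: every thread handles exactly one module and exits. The loader argument exists only so that the function can be exercised without root privileges.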

Lennart
--
Lennart Poettering, Red Hat
Lucas De Marchi
2014-12-22 17:59:57 UTC
Permalink
On Mon, Dec 22, 2014 at 1:04 PM, Lennart Poettering
Post by Lennart Poettering
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
- It is dangerous to load kernel modules from PID 1 since module loading can get stuck
- Since modules are actually loaded with the thread that calls the
syscall, systemd would need additional threads
- Multi Threading is not really aimed in systemd for stability reasons
The probably safest way to do what you intended is to use an
additional process to load your modules, which could be easily done
by using ExecStartPre= in a service file. We are doing it exactly
this way not with kmod but with a tool that loads modules in
parallel.
I'd be willing to merge a good patch that beefs up
systemd-modules-load to load the specified modules in parallel, with
one thread for each.
We already have a very limited number of threaded bits in systemd, and
I figure it would be OK to do that for this too.
Please keep the threading minimal though, i.e. one kmod context per
thread, so that we need no synchronization and no locking. One thread
per module, i.e. no worker thread logic with thread reuse. Also,
please set a thread name, so that a hanging module load only hangs one
specific thread and the backtrace shows which module is at fault.
I'm skeptical you would get any speed up for that. I think it would be
better to have some numbers shared before merging such a thing.

If you have 1 context per module/thread you will need to initialize
each context which is really the most expensive part in userspace,
particularly if finit_module() is being used (which you should unless
you have restrictions on the physical size taken by the modules). Bear
in mind the udev logic has only 1 context, so the initialization is
amortized among the multiple module load calls.

For the "don't load until it's needed" I very much prefer the static
nodes approach we have. Shouldn't this be used instead of filling
modules-load.d with lots of entries?

I really miss numbers here and more information on which modules are
taking long because they are serialized.
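One way to collect the per-module numbers asked for above is to time each load individually. The sketch below is an assumption-laden illustration, not an existing tool: time_loads and slowest are hypothetical helper names, and the loader is whatever callable actually performs the load (for example a wrapper around modprobe).

```python
import time


def time_loads(modules, loader):
    """Call loader(module) for each module in order; return {module: seconds}."""
    timings = {}
    for module in modules:
        start = time.monotonic()
        loader(module)
        timings[module] = time.monotonic() - start
    return timings


def slowest(timings, count=3):
    """Return the `count` slowest (module, seconds) pairs, slowest first."""
    return sorted(timings.items(), key=lambda kv: kv[1], reverse=True)[:count]
```

Comparing the sum of these sequential timings with the wall-clock time of a parallel run gives a first estimate of how much, if anything, parallel loading saves on a given system.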
--
Lucas De Marchi
Hoyer, Marko (ADITG/SW2)
2014-12-23 12:21:32 UTC
Permalink
-----Original Message-----
Sent: Monday, December 22, 2014 7:00 PM
To: Lennart Poettering
Subject: Re: [systemd-devel] Improving module loading
On Mon, Dec 22, 2014 at 1:04 PM, Lennart Poettering
Post by Lennart Poettering
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
- It is dangerous to load kernel modules from PID 1 since module
loading can get stuck
- Since modules are actually loaded with the thread that calls the
syscall, systemd would need additional threads
- Multi Threading is not really aimed in systemd for stability reasons
The probably safest way to do what you intended is to use an
additional process to load your modules, which could be easily done
by using ExecStartPre= in a service file. We are doing it exactly
this way not with kmod but with a tool that loads modules in
parallel.
I'd be willing to merge a good patch that beefs up
systemd-modules-load to load the specified modules in parallel, with
one thread for each.
We already have a very limited number of threaded bits in systemd, and
I figure it would be OK to do that for this too.
Please keep the threading minimal though, i.e. one kmod context per
thread, so that we need no synchronization and no locking. One thread
per module, i.e. no worker thread logic with thread reuse. Also,
please set a thread name, so that a hanging module load only hangs one
specific thread and the backtrace shows which module is at fault.
I'm skeptical you would get any speed up for that. I think it would be
better to have some numbers shared before merging such a thing.
As I already outlined in my answer to Greg, parallel loading was not our main motivation for inventing something new. We found that parallel loading gave us some benefit for some of our modules, so we integrated this feature. Since we are not using udevd during startup at all, most of our modules are loaded manually. I've no idea how things are distributed between systemd-modules-load and udevd in conventional Linux desktop or server systems. If only a handful of modules are actually loaded using systemd-modules-load, it is probably not worth optimizing at this end.

Does anyone have concrete numbers on how many modules are loaded "by hand" using systemd-modules-load in a conventional system?
If you have 1 context per module/thread you will need to initialize
each context which is really the most expensive part in userspace,
particularly if finit_module() is being used (which you should unless
you have restrictions on the physical size taken by the modules). Bear
in mind the udev logic has only 1 context, so the initialization is
amortized among the multiple module load calls.
This does not really match my experience. Once the kmod binary cache is in the VFS page cache, getting a new context is really fast, even in new processes. The expensive thing about udev is that it very quickly starts forking off worker processes, so in the end at least one new context per process is created too. Additionally, the people who decide to use systemd-modules-load to load specific modules have good reasons for that. A prominent one is probably that udevd does not work for the respective module because no concrete device is coupled with it. I think we do not have that many kernel modules that need to be handled like this, which brings us back to the question of whether it is really worth tuning systemd-modules-load.
For the "don't load until it's needed" I very much prefer the static
nodes approach we have. Shouldn't this be used instead of filling
modules-load.d with lots of entries?
We are not using systemd-modules-load for applying this approach since it is trying to load all modules in one shot. We are executing our tool several times during startup to get up hardware piece by piece exactly at the point where it is needed. The tool is either executed like modprobe or with a configuration file containing a set of modules to be loaded in one shot and some other stuff needed for synchronization and setup.
I really miss numbers here and more information on which modules are
taking long because they are serialized.
--
Lucas De Marchi
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Tom Gundersen
2014-12-23 14:37:50 UTC
Permalink
On Tue, Dec 23, 2014 at 1:21 PM, Hoyer, Marko (ADITG/SW2)
Post by Hoyer, Marko (ADITG/SW2)
-----Original Message-----
Sent: Monday, December 22, 2014 7:00 PM
To: Lennart Poettering
Subject: Re: [systemd-devel] Improving module loading
On Mon, Dec 22, 2014 at 1:04 PM, Lennart Poettering
Post by Lennart Poettering
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
- It is dangerous to load kernel modules from PID 1 since module
loading can get stuck
- Since modules are actually loaded with the thread that calls the
syscall, systemd would need additional threads
- Multi Threading is not really aimed in systemd for stability reasons
The probably safest way to do what you intended is to use an
additional process to load your modules, which could be easily done
by using ExecStartPre= in a service file. We are doing it exactly
this way not with kmod but with a tool that loads modules in
parallel.
I'd be willing to merge a good patch that beefs up
systemd-modules-load to load the specified modules in parallel, with
one thread for each.
We already have a very limited number of threaded bits in systemd, and
I figure it would be OK to do that for this too.
Please keep the threading minimal though, i.e. one kmod context per
thread, so that we need no synchronization and no locking. One thread
per module, i.e. no worker thread logic with thread reuse. Also,
please set a thread name, so that a hanging module load only hangs one
specific thread and the backtrace shows which module is at fault.
I'm skeptical you would get any speed up for that. I think it would be
better to have some numbers shared before merging such a thing.
As I already outlined in my answer to Greg, parallel loading was not our main motivation for inventing something new. We found that parallel loading gave us some benefit for some of our modules, so we integrated this feature. Since we are not using udevd during startup at all, most of our modules are loaded manually. I've no idea how things are distributed between systemd-modules-load and udevd in conventional Linux desktop or server systems. If only a handful of modules are actually loaded using systemd-modules-load, it is probably not worth optimizing at this end.
Does anyone have concrete numbers on how many modules are loaded "by hand" using systemd-modules-load in a conventional system?
In a stock Fedora/Arch (and probably others, but didn't check)
systemd-modules-load is not used at all. It is mostly there to make it
simple to work around sub-par kernel modules, but most have been fixed
by now, so it is increasingly irrelevant.

I agree that it is probably not worth doing lots of
systemd-modules-load-specific hacks to speed it up, but if we split
out the worker-pool logic from udev (which I'm currently working on as
we need it in more places), we can optimize that in a generic way and
if the numbers show that systemd-modules-load would benefit from using
it, I'd be all for hooking that up too (as doing so would then be
trivial).
Post by Hoyer, Marko (ADITG/SW2)
If you have 1 context per module/thread you will need to initialize
each context which is really the most expensive part in userspace,
particularly if finit_module() is being used (which you should unless
you have restrictions on the physical size taken by the modules). Bear
in mind the udev logic has only 1 context, so the initialization is
amortized among the multiple module load calls.
This does not really match my experience. Once the kmod binary cache is in the VFS page cache, getting a new context is really fast, even in new processes. The expensive thing about udev is that it very quickly starts forking off worker processes, so in the end at least one new context per process is created too. Additionally, the people who decide to use systemd-modules-load to load specific modules have good reasons for that. A prominent one is probably that udevd does not work for the respective module because no concrete device is coupled with it. I think we do not have that many kernel modules that need to be handled like this, which brings us back to the question of whether it is really worth tuning systemd-modules-load.
I'm not aware of any kernel modules that legitimately need to be
loaded in this way (i.e., all the ones that do can/should be fixed).
Post by Hoyer, Marko (ADITG/SW2)
For the "don't load until it's needed" I very much prefer the static
nodes approach we have. Shouldn't this be used instead of filling
modules-load.d with lots of entries?
We are not using systemd-modules-load for applying this approach since it is trying to load all modules in one shot. We are executing our tool several times during startup to get up hardware piece by piece exactly at the point where it is needed. The tool is either executed like modprobe or with a configuration file containing a set of modules to be loaded in one shot and some other stuff needed for synchronization and setup.
I'd be interested in understanding better the need for this
serialization. What bottleneck do you hit if you load modules eagerly?
CPU/IO? Could this be worked around by tweaking udev (limiting the
number of workers, fiddling with the order of triggering, tweaking
cgroup properties of the processes doing the loading,...)? It would be
nice to understand what a generic solution to this would look like so
we could consider if there is anything we want to improve in udev
here...

Cheers,

Tom
Hoyer, Marko (ADITG/SW2)
2014-12-23 11:53:01 UTC
Permalink
-----Original Message-----
Sent: Monday, December 22, 2014 4:05 PM
To: Hoyer, Marko (ADITG/SW2)
Subject: Re: [systemd-devel] Improving module loading
Post by Hoyer, Marko (ADITG/SW2)
I had such a discussion earlier with some of the systemd guys. My
intention was to introduce an additional unit for module loading for
exactly the reason you mentioned. The following (reasonable) outcome
- It is dangerous to load kernel modules from PID 1 since module
loading can get stuck
- Since modules are actually loaded with the thread that calls the
syscall, systemd would need additional threads
- Multithreading is not really wanted in systemd, for stability
reasons
Post by Hoyer, Marko (ADITG/SW2)
The probably safest way to do what you intended is to use an
additional process to load your modules, which could easily be done
by using ExecStartPre= in a service file. We are doing it exactly this
way, not with kmod but with a tool that loads modules in parallel.
I'd be willing to merge a good patch that beefs up systemd-modules-load
to load the specified modules in parallel, with one thread for each.
Ok, I'll take this into my company. I've to find out if and how it is possible to cut out pieces from our software and to provide a patch for systemd-modules-load. Maybe we could go open source with the complete tool. I've to find out ...
We already have a very limited number of threaded bits in systemd, and
I figure it would be OK to do that for this too.
Please keep the threading minimal though, i.e. one kmod context per
thread, so that we need no synchronization and no locking. One thread
per module, i.e. no worker thread logic with thread reusing. Also,
please set a thread name, so that a hanging module load only hangs one
specific thread and the backtrace shows which module is at fault.
- we are actually using one kmod context for all modules in all threads
- the main thread (kind of control thread) is doing most of the stuff with the context (querying, resolving dependencies, ...)
- The worker threads are doing only one call: kmod_module_insert_module(kmodule->mod,0,kmodule->mod_params)
- I took a look into the implementation of this function and did not find so much dangerous stuff in it
- If it would make people feel safer, I could create different contexts for the worker threads
- and actually, we are using a fixed pool of worker threads receiving jobs from the main thread ;) Synchronization is done via eventfd.
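The "one thread per module, no thread reuse, named threads" layout requested above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from systemd or from the tool described here: Python stands in for C, the function names load_in_parallel and modprobe are made up for the example, and loading goes through the modprobe command line instead of libkmod, so the "one kmod context per thread" part is not modelled (each modprobe process has its own context anyway).

```python
import subprocess
import threading


def modprobe(module):
    """Load one module through the modprobe CLI; True on success."""
    return subprocess.run(["modprobe", module]).returncode == 0


def load_in_parallel(modules, load=modprobe):
    """Start one short-lived thread per module and wait for all of them.

    Threads are never reused. The only shared object is `results`,
    and every thread writes just the key of its own module.
    `load` is injectable so the sketch can be tried without root.
    """
    results = {}

    def worker(module):
        results[module] = load(module)

    threads = [
        # Name the thread after its module so a hang is easy to attribute.
        threading.Thread(target=worker, args=(m,), name=f"load-{m}")
        for m in modules
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

A fixed worker pool fed through eventfd, as in the last bullet, would bound the number of threads, at the cost of the queueing and synchronization that the per-module layout avoids.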

I guess it would be best if we would take a look into our stuff once we are sure that we are able to contribute the stuff and that this stuff is actually wanted. We can then surely do some rework based on your experience.
Lennart
--
Lennart Poettering, Red Hat
Best regards

Marko Hoyer
Software Group II (ADITG/SW2)

Tel. +49 5121 49 6948
Alison Chaiken
2014-12-24 06:00:50 UTC
Are you talking about "Save To RAM", "Save to Disk", or a hybrid combination of both? Or do you have something
completely different in mind?
A number of devices in the past have done a "save system image" to flash, and then when starting up, just load the
system image into memory and jump into it, everything up and running with no "startup time" needed other than the
initial memory load.
Not all processors currently support this behavior. See Russ Dill's
talk at 2013 ELC,

"Extending the swsusp Hibernation Framework to ARM,"
http://elinux.org/images/0/0c/Slides.pdf

or, put differently, on x86 3.16,
$# cat /sys/power/state
freeze standby mem disk

On Cortex-A9 3.14:
$# cat /sys/power/state
freeze mem

Dill's work added hibernation support for AM33xx. My understanding
of his presentation is that hibernation is not fully implemented for
other ARM processors.
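The comparison of the two /sys/power/state listings above can be automated with a few lines. A minimal sketch, assuming only that the file holds a whitespace-separated list of state names and that "disk" is the name the kernel uses for suspend-to-disk (hibernation); the helper names are invented for this example.

```python
def sleep_states(state_text):
    """Split the contents of /sys/power/state into a list of state names."""
    return state_text.split()


def supports_hibernation(state_text):
    """Hibernation (suspend to disk) is advertised as the 'disk' state."""
    return "disk" in sleep_states(state_text)


def read_power_state(path="/sys/power/state"):
    """Read the state file; not called automatically in this sketch."""
    with open(path) as f:
        return f.read()
```

With the two outputs quoted above, supports_hibernation("freeze standby mem disk") is true for the x86 box and supports_hibernation("freeze mem") is false for the Cortex-A9 board.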

On another topic that came up in this thread, why does
systemd-udev-settle.service exist? Doesn't the execution of this
service imply a synchronization point, and doesn't systemd create
targets rather than services for this purpose? Wouldn't
systemd-udev-settle.target make more sense then?
Post by Tom Gundersen
In a stock Fedora/Arch (and probably others, but didn't check)
systemd-modules-load is not used at all.
[ . . . ]
Post by Tom Gundersen
I'm not aware of any kernel modules that legitimately need to be
loaded in this way (i.e., all the ones that do can/should be fixed).
On my Debian Testing system, I see fuse, loop, lp, ppdev and
parport_pc. The last 3 are related to printing, and presumably must
be preloaded because some printers will not usefully identify
themselves when powered on. Giving unsophisticated users access to a
wide variety of hotplugged devices is undoubtedly the main reason
distros want to use systemd-modules-load.
We are not using systemd-modules-load for applying this approach since it is trying to load all modules in one shot.
Can systemd units list kernel modules as explicit dependencies? If
so, systemd's usual methods for ordering the start of units can
influence the loading order of modules.
- we have heavy graphics drivers (~800kb, stripped), they are needed half the way at startup
- video processing unit drivers (don't know the size), they are needed half the way at startup
- wireless & bluetooth, they are needed very late
- usb subsystem, conventionally needed very late (but this finally depends on the concrete product)
- hot plug mass storage handling, conventionally needed very late (but this finally depends on the concrete product)
- audio driver, in most of our products needed very late
- some drivers for INC communication (partly needed very early -> we compiled them in, partly needed later -> we have them as modules)
Consider that wireless, bluetooth, audio and hotplug mass storage have
the modules on which they rely as systemd Requisites in their unit
files. We put the units for these services into a connectivity.target
that comes After a render.target that the graphics, video and INC are
in. render.target then has as Requisites the GPU, VPU and INC
modules. When each of these targets is started, the units could
insmod the modules and just skip udev rules altogether. These
dependencies won't prevent the kernel from trying to load the later
modules sooner, but insmod'ing earlier needed modules explicitly will
still influence the order.
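To make the ordering argument concrete, here is a small model of the two targets described above and the module load order that falls out of them. It is a sketch only: the module names are placeholders and not real drivers, the dictionary layout is invented for the example, and a real setup would express this in unit files with After= ordering, not in Python.

```python
from graphlib import TopologicalSorter

# Hypothetical layout following the description above.
TARGETS = {
    "render.target": {
        "after": [],
        "modules": ["gpu_drv", "vpu_drv", "inc_drv"],
    },
    "connectivity.target": {
        "after": ["render.target"],
        "modules": ["wifi_drv", "bt_drv", "audio_drv", "usb_storage_drv"],
    },
}


def module_load_order(targets):
    """Return modules in the order in which their targets would start."""
    # TopologicalSorter expects a mapping of node -> predecessors.
    graph = {name: spec["after"] for name, spec in targets.items()}
    order = []
    for target in TopologicalSorter(graph).static_order():
        order.extend(targets[target]["modules"])
    return order
```

As noted above, this only orders the explicit loads; the kernel or udev may still pull in a later module earlier if something touches the hardware first.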

-- Alison Chaiken,
Mentor Graphics
--
Alison Chaiken ***@she-devel.com
650-279-5600
http://{she-devel.com,exerciseforthereader.org}
Never underestimate the cleverness of advertisers, or mischief makers,
or criminals. -- Don Norman
Lucas De Marchi
2014-12-24 11:58:10 UTC
Post by Alison Chaiken
Post by Tom Gundersen
In a stock Fedora/Arch (and probably others, but didn't check)
systemd-modules-load is not used at all.
[ . . . ]
Post by Tom Gundersen
I'm not aware of any kernel modules that legitimately need to be
loaded in this way (i.e., all the ones that do can/should be fixed).
On my Debian Testing system, I see fuse, loop, lp, ppdev and
parport_pc. The last 3 are related to printing, and presumably must
be preloaded because some printers will not usefully identify
themselves when powered on. Giving unsophisticated users access to a
wide variety of hotplugged devices is undoubtedly the main reason
distros want to use systemd-modules-load.
fuse and loop are the perfect examples of modules that should not be
in modules-load.d. Take a look in the output of 'kmod static-nodes'.
All these dead nodes will be created by systemd during startup, but
the module will only be loaded by the kernel when someone actually
tries to use them.

$ ls /dev/loop-control
/dev/loop-control
$ lsmod | grep loop
$ touch /dev/loop-control
$ lsmod | grep loop
loop 26560 0
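A quick way to spot modules-load.d entries that are redundant for this reason is to compare them against the static-node list. The sketch below assumes the list comes from a modules.devname-style file (as far as I know this is what 'kmod static-nodes' reads) with lines of the form "<module> <device-name> <type><major>:<minor>" and '#' comments; the function names are invented for the example, and no normalization of '-' versus '_' in module names is attempted.

```python
def static_node_modules(devname_text):
    """Return the module names listed in modules.devname-style text.

    Assumed line format: "<module> <device-name> <type><major>:<minor>".
    Blank lines and lines starting with '#' are skipped.
    """
    modules = set()
    for line in devname_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        modules.add(line.split()[0])
    return modules


def redundant_entries(modules_load_text, devname_text):
    """List modules-load.d entries that already have a static node."""
    have_node = static_node_modules(devname_text)
    entries = [
        line.strip()
        for line in modules_load_text.splitlines()
        # modules-load.d treats lines starting with '#' or ';' as comments.
        if line.strip() and not line.lstrip().startswith(("#", ";"))
    ]
    return [m for m in entries if m in have_node]
```

For the Debian list quoted above, fuse and loop would be reported, while lp, ppdev and parport_pc would not, provided they have no static node on the system in question.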
--
Lucas De Marchi
Michael Biebl
2014-12-25 00:31:06 UTC
Post by Lucas De Marchi
Post by Alison Chaiken
On my Debian Testing system, I see fuse, loop, lp, ppdev and
parport_pc. The last 3 are related to printing, and presumably must
..
Post by Lucas De Marchi
fuse and loop are the perfect examples of modules that should not be
in modules-load.d. Take a look in the output of 'kmod static-nodes'.
All these dead nodes will be created by systemd during startup, but
the module will only be loaded by the kernel when someone actually
tries to use them.
Those entries were most likely created by older versions of
debian-installer. The current version of debian-installer for
debian-installer should no longer add fuse or loop to /etc/modules.

Alison, if the /etc/modules entry was created by the (beta) jessie
d-i installer, please do file a bug report against the
debian-installer package.
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?
Michael Biebl
2014-12-25 01:06:03 UTC
Post by Michael Biebl
Those entries were most likely created by older versions of
debian-installer. The current version of debian-installer for
debian-installer should no longer add fuse or loop to /etc/modules.
[ .. ]
fuse: /lib/modules-load.d
fuse: /lib/modules-load.d/fuse.conf
Package: fuse
Source: fuse (2.9.3-15)
Version: 2.9.3-15+b1
The /lib/modules-load.d/fuse.conf file should imo be dropped from the
fuse package, i.e. this looks like a genuine bug to me worth filing a
report.
Can't remember when I installed this system: a year ago maybe? But the
installer was the one from the previous release.
The date of the modules.conf and fuse.conf files is relatively recent,
though. Do you consider the appearance of these a bug?
The "loop" entry in /etc/modules, if still created by a recent
debian-installer, I'd consider a bug. If that happens, please do file
a bug.

If you installed the system with an older version of debian-installer,
you should clean up /etc/modules manually.

Michael

[1] https://www.debian.org/devel/debian-installer/
--
Why is it that all of the instruments seeking intelligent life in the
universe are pointed away from Earth?