Discussion:
[RFC PATCH 0/9] dt: dependencies (for deterministic driver initialization order based on the DT)
Alexander Holler
2014-05-12 16:47:51 UTC
Permalink
Hello,

if I would have to describe the Linux kernels init system (before userspace
starts), it would be like:
Unknown functions with almost unknown functionality are called in an almost
random order.

That reminded me that a kernel-maintainer once said to me:
"We should aim to make things synchronous, deterministic and
stuff-happens-in-the-correct-order."

Looks like either the target moved or no Wilhelm Tell was around. ;)

This is an attempt to reach the target for the case of (platform-)drivers.
It is a mere starting point to reach the final target but it works on two
DT enabled ARM devices I have and it hasn't any implications on other
architectures, platforms or whatever. If the new configuration option,
which is only available if DT is enabled, isn't turned on, there is no
increase of code size or similiar.

So what are these patches I'm posting here?
They offer an imho solid base to fix the 3. problem. they build a deterministic
order in which (platform-)drivers should be initialized, based on datas
(dependencies) found in the device tree. They also offer a starting point to fix
the other 2 problems (unknown functions and unknown functionality) by showing a
way how the long range target of known functions with known functionality could
be reached.

Unfortunately work still isn't done. As written above, this is just a starting
point, neiter complete nor perfect. It is what I could do behind closed doors,
by spending a limited amount of time and resources (I've started to look at
that stuff 3-4 weeks ago, sometimes after 3.14 appeared), so it can be blamed
quick & dirty. But it should be already enough to explain and test the concepts.

Enough forewords.

This is a small patch series to use a deterministic dependency based device
and driver initialization order on machines which are using device tree.
The dependency graph will not only be build based on the device tree itself,
but will use dependencies based on phandle references in the .dts,
automatically added by dtc through a new property.
Manualy adding dependencies to the .dts is possible too.

Advantages:

- Correct order of initialization without any "dirty tricks" in drivers or the
machine init code. The order in which devices/drivers will be initialized
depends only on the DT and the topological sort algorithm used by the
kernel, not some code elsewhere. That means less code and more homogeneity
across different SOCs.
- Might be(come) a little faster because the number of deferred probes should
be minimized (they might not even be necessary anymore at all).
- Using a modified algorithm, it's possible to build a list of drivers which
can be initialized in parallel, e.g. using n threads, where n equals the
number of cores. I have not tested nor implemented it, because I don't have
any multicore DT based board with which I easily can use a (patched) mainline
kernel, just locked down multicore things I don't want to go through the
pain of unlocking them.
- Funny dependency graphs when using Graphviz.

Disadvantages:

- To use this feature correctly, binary blobs must be regenerated by
recompiling them to include properties with dependencies. Without
recompiling them there will be no advantage.
- Binary blobs will be slightly larger (some kb, see numbers below).
- Many .dts might need manual work to add some dependencies. Also most
dependencies will be discovered automatically (based on phandle references
in the .dts, some devices might need additional dependencies.


Some numbers (using kernel 3.14.3):

Dockstar (Kirkwood):
Works out of the box.
Size of dtb without dependencies: 9166
Size of dtb with dependencies: 9579
Graph with 34 nodes, 56 edges
Boot to panic (no root) no deps run 1-4:
1.325474 1.325458 1.325449 1.325494
Boot to panic (no root) deps run 1-4:
4.509989 4.484608 4.316221 4.485310
The large difference in time isn't because of the depency based
init but because ehci detected the connected hd before the panic
occured when deps were enabled. Withoout deps, the panic
already happend without any discovered usb-storage. I haven't
checked why.
The actual times to boot from USB-HD are 3.417248 without
dependencies versus 5.618293 with. I still have to check where
the difference of more than a second does come from, a difference
like on the BBB (see below) should be more reasonable.

BeagleBoneBlack A5C (AM3359):
Had to disable ethernet (driver crashes).
Size of dtb without dependencies: 31379
Size of dtb with dependencies: 33300
Graph with 145 nodes, 266 edges
Boot to panic (no root) no deps run 1-4:
1.229431 1.229516 1.229509 1.229557
Boot to panic (no root) deps run 1-4:
1.361780 1.361442 1.361532 1.361699

BeagleBoard C4 (OMAP34xx):
Had to disable usb (driver crashes) and several other problems,
but an unpatched 3.14.y doesn't really work too (which was the
reason I looked at what happes and did these patches).
Size of dtb without dependencies: 57003
Size of dtb with dependencies: 62580
Graph with 390 nodes, 848 edges
Boot to panic (no root) no deps run 1-4:
3.386535 3.343566 3.381469 3.357208
Boot to panic (no root) deps run 1-4:
5.961425 5.907714 6.053680 5.957855

The difference in boot time is mainly based on the function which
matches drivers to devices based on the compatible string. This
is currently not solved very elegant and walks through multiple
list multiple times. The actual sorting is very fast (just some ms).

For people which do like pictures, I've put the dependency graph for
the Dockstar online: http://ahsoftware.de/dt-kirkwood-dockstar.svg
And a far more complicated dependency graph for the BeagleBoard:
http://ahsoftware.de/dt-omap3-beagle.svg

These pictures makes it easy to see what this feature is about. All the cyan
arrows are the new dependencies, the black ones are the dependecies as
currently used (for device but not driver initialization). So you see, there
is quiet a difference.
If I'm right, those pictures also visualize which drivers could be initialized
in parallel (I haven't checked it, I would have to read the Graphviz
documentation or check the whole graph to be sure). But on a first look at it,
it looks like dot (Graphviz) placed all the nodes which can be initialized in
parallel on the same row. So there are quiet some drivers which could be
initialized in parallel, taking advantage of multiple cores to reduce boot time.

(Just in case you wonder what the first number in the nodes in the pictures
might be, it is the phandle of the device tree node and is usually generated
automatically by dtc and might change when the .dts changes. This number
doesn't have any meaning in regard to the initialization order.)

What follows are the patches I've done, ready to test this feature. These are

- 3 patches for the in-kernel dtc (to add dependencies to the dtb),
- 1 patch for the kernel to use these dependencies to build the initialization
order and create devices,
- 1 patch to register drivers based on the built order,
- 1 patch to show which changes are necessary for some drivers,
- 2 patches to use this feature on kirkwood (very good as small example),
- 1 patch to use this feature on omap2+ (and am33xx),

These patches are based on 3.14.3 and unfortunately don't apply cleanly to
3.15-rcN. But I'm currently too lazy to rebase them as I usually use a stable
kernel to test things I change. And as this is just a RFC, please use 3.14.x
to test these patches.

All patches do explain further what and how they do. And if I get the OK
that they will be merged (after any necessary clean up), I would write
a document for Documentation too (if really wanted, I assume you already have
noticed that I'm not a native english speaker/writer).

My suggestion to continue:

- Merge the first 5 patches (maybe after they got cleaned up). So they won't
disappear and people will find a starting point in the kernel to continue
work on. They don't do any harm and don't increase codesize if the new
kernel option they introduce is disabled. It also might be a good idea to
merge them in order to get the new dependencies into binary DT blobs
as early as possible, even if it might still need some time until they
really will be used.

- Have a look at the other patches. Especially the one for the Kirkwood which
changes the initializazion order by just adding one dependency (ehci vs.
regulator) to the .dts. This shows how such could be done without any changes
on the drivers source code.


If you ask why I did those patches: For the same reason a mountain climber
does climb a mountain. That also explains my limited motivation and
resources. ;)


Regards,

Alexander Holler


LKML-disclaimer (unfortunately necessary):
Please keep away with comments about style, typos or spelling errors in
comments and similiar silly stuff if you have nothing else to say.
Feel free to comment functionality or real errors, but not style, form or
other bureaucrazy things.
And please keep in mind that I'm neiter your intern, your student, your pupil,
nor your child.
Alexander Holler
2014-05-12 16:47:52 UTC
Permalink
During the step from .dts to .dtb the information about dependcies contained
in the .dts through phandle references is lost. This makes it impossible to
use the binary blob to create a dependency graph without knowing the semantic
of all cell arrays.

Therefor automatically add a new property called 'dependencies' to all nodes
which have phandle references in one of their properties.

This new property will contain an array of phandles with one value for every
phandle referenced by other properties in the node.

If such a property already exists (e.g. to manually add dependencies through
the .dts), the existing list will be expanded.

Added phandles will be the phandle of either the referenced node itself (if
it has a property named 'compatible', or of the next parent of the referenced
node which as property named 'compatible'. This ensures only dependencies to
drivers will be added.

References to phandles of parent or child nodes will not be added to this
property, because this information is already contained in the blob (in the
form of the tree itself).

No dependencies to disabled nodes will be added.

Signed-off-by: Alexander Holler <holler-SXC+***@public.gmane.org>
---
scripts/dtc/Makefile | 3 +-
scripts/dtc/Makefile.dtc | 1 +
scripts/dtc/dependencies.c | 108 +++++++++++++++++++++++++++++++++++++++++++++
scripts/dtc/dtc.c | 12 ++++-
scripts/dtc/dtc.h | 3 ++
5 files changed, 125 insertions(+), 2 deletions(-)
create mode 100644 scripts/dtc/dependencies.c

diff --git a/scripts/dtc/Makefile b/scripts/dtc/Makefile
index 2a48022..1174cf9 100644
--- a/scripts/dtc/Makefile
+++ b/scripts/dtc/Makefile
@@ -4,7 +4,7 @@ hostprogs-y := dtc
always := $(hostprogs-y)

dtc-objs := dtc.o flattree.o fstree.o data.o livetree.o treesource.o \
- srcpos.o checks.o util.o
+ srcpos.o checks.o util.o dependencies.o
dtc-objs += dtc-lexer.lex.o dtc-parser.tab.o

# Source files need to get at the userspace version of libfdt_env.h to compile
@@ -13,6 +13,7 @@ HOSTCFLAGS_DTC := -I$(src) -I$(src)/libfdt

HOSTCFLAGS_checks.o := $(HOSTCFLAGS_DTC)
HOSTCFLAGS_data.o := $(HOSTCFLAGS_DTC)
+HOSTCFLAGS_dependencies.o := $(HOSTCFLAGS_DTC)
HOSTCFLAGS_dtc.o := $(HOSTCFLAGS_DTC)
HOSTCFLAGS_flattree.o := $(HOSTCFLAGS_DTC)
HOSTCFLAGS_fstree.o := $(HOSTCFLAGS_DTC)
diff --git a/scripts/dtc/Makefile.dtc b/scripts/dtc/Makefile.dtc
index bece49b..5fb5343 100644
--- a/scripts/dtc/Makefile.dtc
+++ b/scripts/dtc/Makefile.dtc
@@ -6,6 +6,7 @@
DTC_SRCS = \
checks.c \
data.c \
+ dependencies.c \
dtc.c \
flattree.c \
fstree.c \
diff --git a/scripts/dtc/dependencies.c b/scripts/dtc/dependencies.c
new file mode 100644
index 0000000..dd4658c
--- /dev/null
+++ b/scripts/dtc/dependencies.c
@@ -0,0 +1,108 @@
+/*
+ * Code to add a property which contains dependencies (used phandle references)
+ * to all (driver) nodes which are having phandle references.
+ *
+ * Copyright (C) 2014 Alexander Holler <holler-SXC+***@public.gmane.org>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <dtc.h>
+
+/* Searches upwards for a node with a property 'compatible' */
+static struct node *find_compatible_not_disabled(struct node *node)
+{
+ struct property *prop;
+
+ while (node) {
+ prop = get_property(node, "compatible");
+ if (prop) {
+ prop = get_property(node, "status");
+ if (prop)
+ if (!prop->val.len ||
+ (strcmp(prop->val.val, "okay") &&
+ strcmp(prop->val.val, "ok")))
+ return NULL; /* disabled */
+ return node;
+ }
+ node = node->parent;
+ }
+ return NULL;
+}
+
+static bool is_parent_of(struct node *node1, struct node *node2)
+{
+ while (node2) {
+ if (node2->parent == node1)
+ return true;
+ node2 = node2->parent;
+ }
+ return false;
+
+}
+
+static void add_deps(struct node *dt, struct node *node, struct property *prop)
+{
+ struct marker *m = prop->val.markers;
+ struct node *refnode;
+ cell_t phandle;
+ struct property *prop_deps;
+ unsigned i;
+ cell_t *cell;
+ struct node *source;
+ struct node *target;
+
+ for_each_marker_of_type(m, REF_PHANDLE) {
+ assert(m->offset + sizeof(cell_t) <= prop->val.len);
+
+ refnode = get_node_by_ref(dt, m->ref);
+ if (!refnode) {
+ fprintf(stderr,
+ "ERROR: Reference to non-existent node or label \"%s\"\n",
+ m->ref);
+ continue;
+ }
+
+ source = find_compatible_not_disabled(node);
+ target = find_compatible_not_disabled(refnode);
+ if (!source || !target || source == target ||
+ is_parent_of(source, target) ||
+ is_parent_of(target, source))
+ continue;
+ phandle = get_node_phandle(dt, target);
+ prop_deps = get_property(source, "dependencies");
+ if (!prop_deps) {
+ add_property(source,
+ build_property("dependencies",
+ data_append_cell(empty_data, phandle)));
+ continue;
+ }
+ cell = (cell_t *)prop_deps->val.val;
+ for (i = 0; i < prop_deps->val.len/4; ++i)
+ if (*cell++ == cpu_to_fdt32(phandle))
+ break;
+ if (i < prop_deps->val.len/4)
+ continue; /* avoid duplicates */
+ prop_deps->val = data_append_cell(prop_deps->val, phandle);
+ }
+}
+
+static void process_nodes_props(struct node *dt, struct node *node)
+{
+ struct node *child;
+ struct property *prop;
+
+ for_each_property(node, prop)
+ add_deps(dt, node, prop);
+
+ for_each_child(node, child)
+ process_nodes_props(dt, child);
+}
+
+void add_dependencies(struct boot_info *bi)
+{
+ process_nodes_props(bi->dt, bi->dt);
+}
diff --git a/scripts/dtc/dtc.c b/scripts/dtc/dtc.c
index a375683..fbe49d9 100644
--- a/scripts/dtc/dtc.c
+++ b/scripts/dtc/dtc.c
@@ -86,6 +86,8 @@ static void __attribute__ ((noreturn)) usage(void)
fprintf(stderr, "\t\tAdd a path to search for include files\n");
fprintf(stderr, "\t-s\n");
fprintf(stderr, "\t\tSort nodes and properties before outputting (only useful for\n\t\tcomparing trees)\n");
+ fprintf(stderr, "\t-D\n");
+ fprintf(stderr, "\t\tDo not automatically add dependencies for phandle references\n");
fprintf(stderr, "\t-v\n");
fprintf(stderr, "\t\tPrint DTC version and exit\n");
fprintf(stderr, "\t-H <phandle format>\n");
@@ -107,6 +109,7 @@ int main(int argc, char *argv[])
const char *outname = "-";
const char *depname = NULL;
int force = 0, sort = 0;
+ int dependencies = 1;
const char *arg;
int opt;
FILE *outf = NULL;
@@ -118,7 +121,7 @@ int main(int argc, char *argv[])
minsize = 0;
padsize = 0;

- while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fqb:i:vH:sW:E:"))
+ while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fqb:i:vH:sDW:E:"))
!= EOF) {
switch (opt) {
case 'I':
@@ -176,6 +179,10 @@ int main(int argc, char *argv[])
sort = 1;
break;

+ case 'D':
+ dependencies = false;
+ break;
+
case 'W':
parse_checks_option(true, false, optarg);
break;
@@ -235,6 +242,9 @@ int main(int argc, char *argv[])
if (sort)
sort_tree(bi);

+ if (dependencies)
+ add_dependencies(bi);
+
if (streq(outname, "-")) {
outf = stdout;
} else {
diff --git a/scripts/dtc/dtc.h b/scripts/dtc/dtc.h
index 3e42a07..c3dbeac 100644
--- a/scripts/dtc/dtc.h
+++ b/scripts/dtc/dtc.h
@@ -267,4 +267,7 @@ struct boot_info *dt_from_source(const char *f);

struct boot_info *dt_from_fs(const char *dirname);

+/* Dependencies */
+void add_dependencies(struct boot_info *bi);
+
#endif /* _DTC_H */
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Tomasz Figa
2014-05-17 12:16:52 UTC
Permalink
Hi Alexander,
Post by Alexander Holler
During the step from .dts to .dtb the information about dependcies contained
in the .dts through phandle references is lost. This makes it impossible to
use the binary blob to create a dependency graph without knowing the semantic
of all cell arrays.
Therefor automatically add a new property called 'dependencies' to all nodes
which have phandle references in one of their properties.
This new property will contain an array of phandles with one value for every
phandle referenced by other properties in the node.
If such a property already exists (e.g. to manually add dependencies through
the .dts), the existing list will be expanded.
Added phandles will be the phandle of either the referenced node itself (if
it has a property named 'compatible', or of the next parent of the referenced
node which as property named 'compatible'. This ensures only dependencies to
drivers will be added.
Sounds good.
Post by Alexander Holler
References to phandles of parent or child nodes will not be added to this
property, because this information is already contained in the blob (in the
form of the tree itself).
I wonder if we shouldn't be including them too for consistency related
reasons, so we have all the necessary information in one place.
References to child nodes are great recipes for cycles, though...

No strong opinion, though, just an idea.
Post by Alexander Holler
No dependencies to disabled nodes will be added.
Same here. IMHO it might be wise to let the parsing entity (e.g. kernel)
decide whether to ignore a dependency to disabled node or not.

Otherwise, I like the simplicity of compile-time dependency list
creation. Quite a nice work.

Best regards,
Tomasz
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-19 12:35:49 UTC
Permalink
Post by Tomasz Figa
Post by Alexander Holler
References to phandles of parent or child nodes will not be added to this
property, because this information is already contained in the blob (in the
form of the tree itself).
I wonder if we shouldn't be including them too for consistency related
reasons, so we have all the necessary information in one place.
References to child nodes are great recipes for cycles, though...
No strong opinion, though, just an idea.
As said, they are already in the tree itself. And they are already
included in the graph (these are the black edges), so they just don't
appear in the property dependencies.
Post by Tomasz Figa
Post by Alexander Holler
No dependencies to disabled nodes will be added.
Same here. IMHO it might be wise to let the parsing entity (e.g. kernel)
decide whether to ignore a dependency to disabled node or not.
Otherwise, I like the simplicity of compile-time dependency list
creation. Quite a nice work.
Thanks.

What's still questionable about the patches for dtc is if dependencies
to devices and not just drivers should be included in the new property
dependencies too. My current assumption is that all devices belonging to
one and the same driver don't have dependencies between each other. In
other words the order in which devices will be attached to one and the
same driver isn't important. If that assumption is correct it would be
possible to just attach all devices belonging to a driver after the
driver was loaded (also I haven't that done in my patches).

And thinking about that again, I think I was wrong and doing so have
been some kind of evil premature optimization I did in order to spare a
few dependencies/edges. But changing this can done by removing a few
lines in the code for dtc (patch 1).

Regards,

Alexander Holler
Alexander Holler
2014-05-19 17:26:28 UTC
Permalink
[ Crap. Sorry about the duplicate post. Stupid HTML; didn't hit the
lists. -- jdl ]
What's still questionable about the patches for dtc is if dependencies to
devices and not just drivers should be included in the new property
dependencies too.
I don't think the DTC should have any semantic knowledge of why these
dependency arcs are being added to the graph. Sure, it could be that
different types of arcs are added, and that the total dependency graph
travels multiple such arc types to obtains some valid topological sort,
but the DTC itself should just not care.
I will remove those policies (which means including all dependencies).
As said below, I already thought it was evil premature optimization (I
did in order to make the graph a bit smaller and to save some bytes).
(No date, it isn't a paid project so I will do it whenever I feel good
to do so).
After saying that, there are likely semantic checks that could be added to
ensure some policy about those arcs was followed. Separate the
implementation from the policy. There is already plenty of discussion
down that line within the DTC ongoing.
Hmm, discussion about what? Those dependencies or about semantic checks?

Btw., if someone has a problem with the necessary time to do the
topological sort at boot time (needs a few ms on a single core omap with
600 MHz), there could be an additional option to add a new property
which includes the whole (already topological sorted) list. That
wouldn't be much effort. But currently I don't think any DT enabled
device is in need of having to avoid doing the topological sort itself.

Regards,

Alexander Holler
HTH,
jdl
References to phandles of parent or child nodes will not be added to this
Post by Tomasz Figa
Post by Alexander Holler
property, because this information is already contained in the blob (in the
form of the tree itself).
I wonder if we shouldn't be including them too for consistency related
reasons, so we have all the necessary information in one place.
References to child nodes are great recipes for cycles, though...
No strong opinion, though, just an idea.
As said, they are already in the tree itself. And they are already
included in the graph (these are the black edges), so they just don't
appear in the property dependencies.
Post by Tomasz Figa
Post by Alexander Holler
No dependencies to disabled nodes will be added.
Same here. IMHO it might be wise to let the parsing entity (e.g. kernel)
decide whether to ignore a dependency to disabled node or not.
Otherwise, I like the simplicity of compile-time dependency list
creation. Quite a nice work.
Thanks.
What's still questionable about the patches for dtc is if dependencies to
devices and not just drivers should be included in the new property
dependencies too. My current assumption is that all devices belonging to
one and the same driver don't have dependencies between each other. In
other words the order in which devices will be attached to one and the same
driver isn't important. If that assumption is correct it would be possible
to just attach all devices belonging to a driver after the driver was
loaded (also I haven't that done in my patches).
And thinking about that again, I think I was wrong and doing so have been
some kind of evil premature optimization I did in order to spare a few
dependencies/edges. But changing this can done by removing a few lines in
the code for dtc (patch 1).
Regards,
Alexander Holler
_______________________________________________
linux-arm-kernel mailing list
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Grant Likely
2014-05-27 20:02:25 UTC
Permalink
Post by Alexander Holler
Post by Tomasz Figa
Post by Alexander Holler
References to phandles of parent or child nodes will not be added to this
property, because this information is already contained in the blob (in the
form of the tree itself).
I wonder if we shouldn't be including them too for consistency related
reasons, so we have all the necessary information in one place.
References to child nodes are great recipes for cycles, though...
No strong opinion, though, just an idea.
As said, they are already in the tree itself. And they are already
included in the graph (these are the black edges), so they just don't
appear in the property dependencies.
Post by Tomasz Figa
Post by Alexander Holler
No dependencies to disabled nodes will be added.
Same here. IMHO it might be wise to let the parsing entity (e.g. kernel)
decide whether to ignore a dependency to disabled node or not.
Otherwise, I like the simplicity of compile-time dependency list
creation. Quite a nice work.
Thanks.
What's still questionable about the patches for dtc is if dependencies
to devices and not just drivers should be included in the new property
dependencies too. My current assumption is that all devices belonging to
one and the same driver don't have dependencies between each other. In
other words the order in which devices will be attached to one and the
same driver isn't important. If that assumption is correct it would be
possible to just attach all devices belonging to a driver after the
driver was loaded (also I haven't that done in my patches).
There aren't really any guarantees here. It is perfectly valid to have
two of the same device depending on the other, or even a device with a
different driver between the two.

There's always going to be corner cases on the dependency chain. The
question is whether or not it is worth trying to solve every concievable
order, or if a partway solution is good enough.

g.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-27 20:31:08 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
What's still questionable about the patches for dtc is if dependencies
to devices and not just drivers should be included in the new property
dependencies too. My current assumption is that all devices belonging to
one and the same driver don't have dependencies between each other. In
other words the order in which devices will be attached to one and the
same driver isn't important. If that assumption is correct it would be
possible to just attach all devices belonging to a driver after the
driver was loaded (also I haven't that done in my patches).
There aren't really any guarantees here. It is perfectly valid to have
two of the same device depending on the other, or even a device with a
different driver between the two.
There's always going to be corner cases on the dependency chain. The
question is whether or not it is worth trying to solve every concievable
order, or if a partway solution is good enough.
Solving dependencies always happens automatically, with or without
dependencies inbetween devices. I just ignored dependencies between pure
devices (instead changed them into dependencies between drivers) because
I'm still not sure how to handle devices at all. Below is a diff ontop
my dtc-patches to include dependencies between devices too. As said, the
changes to do so are minimal. Of course, the graphs are a bit more
complex, because they then include devices too, but that isn't any
problem for the solving algorithm at all.


diff --git a/scripts/dtc/dependencies.c b/scripts/dtc/dependencies.c
index 06f447b..602ec01 100644
--- a/scripts/dtc/dependencies.c
+++ b/scripts/dtc/dependencies.c
@@ -66,8 +66,10 @@ static void add_deps(struct node *dt, struct node
*node, struct property *prop)
continue;
}

- source = find_compatible_not_disabled(node);
- target = find_compatible_not_disabled(refnode);
+ //source = find_compatible_not_disabled(node);
+ //target = find_compatible_not_disabled(refnode);
+ source = node;
+ target = refnode;
if (!source || !target || source == target ||
is_parent_of(source, target) ||
is_parent_of(target, source))
@@ -385,9 +387,9 @@ static int __init add_deps_lnx(struct device_node
*parent,

if (!__of_device_is_available(node))
return 0;
- if (__of_get_property(node, "compatible", NULL)) {
+// if (__of_get_property(node, "compatible", NULL)) {
if (!parent->phandle) {
- if (__of_get_property(parent, "compatible", NULL))
+// if (__of_get_property(parent, "compatible", NULL))
parent->phandle = 1 + order.max_phandle++;
}
if (!node->phandle)
@@ -425,7 +427,7 @@ static int __init add_deps_lnx(struct device_node
*parent,
if (unlikely(rc))
return rc;
parent = node; /* change the parent only if node is a driver */
- }
+// }
for_each_child_of_node(node, child) {
rc = add_deps_lnx(parent, child, print_dot);
if (unlikely(rc))
--
1.8.3.2



To make it easier to see devices in the produced order, here is another
patch on top:



@@ -464,6 +467,8 @@ void __init of_init_print_order(const char *name)
if (order.order[i]->full_name)
pr_cont(" (%s)", order.order[i]->full_name);
prop = get_property(order.order[i], "compatible");
+ if (!prop)
+ pr_cont(" -");
for (cp = of_prop_next_string(prop, NULL); cp;
cp = of_prop_next_string(prop, cp))
pr_cont(" %s", cp);



With that patch one can do e.g. dtc -t | grep ' - ' to see which device
nodes are included which don't have a compatible property. For the
omap3-beagle this produces the following 21 additional entries in the
init-order:

***@laptopahbt ~/Source/aholler/dtc.git $ dts/make_dtb.sh
dts/omap3-beagle.dts -t | grep ' - '
init 4 0xe4 pinmux_twl4030_pins
(/ocp/***@48002030/pinmux_twl4030_pins) - (parent 0x14b)
init 10 0x107 clocks (/ocp/***@48004000/clocks) - (parent 0x106)
init 17 0x101 clocks (/ocp/***@48306000/clocks) - (parent 0x100)
init 226 0x143 clocks (/ocp/***@48002000/clocks) - (parent 0x142)
init 237 0xe1 pinmux_hsusb2_pins
(/ocp/***@48002030/pinmux_hsusb2_pins) - (parent 0x14b)
init 239 0xe2 pinmux_gpio1_pins (/ocp/***@48002a00/pinmux_gpio1_pins)
- (parent 0x14d)
init 251 0xec pinmux_hsusb2_2_pins
(/ocp/***@480025d8/pinmux_hsusb2_2_pins) - (parent 0x18e)
init 255 0xf4 chosen (/chosen) - (parent 0xf3)
init 256 0xf5 aliases (/aliases) - (parent 0xf3)
init 257 0xf6 memory (/memory) - (parent 0xf3)
init 258 0xf7 cpus (/cpus) - (parent 0xf3)
init 269 0x105 clockdomains (/ocp/***@48306000/clockdomains) - (parent
0x100)
init 311 0x131 clockdomains (/ocp/***@48004000/clockdomains) - (parent 0x106)
init 333 0x149 clockdomains (/ocp/***@48002000/clockdomains) - (parent
0x142)
init 335 0x14c pinmux_uart3_pins
(/ocp/***@48002030/pinmux_uart3_pins) - (parent 0x14b)
init 343 0x157 codec (/ocp/***@48070000/***@48/audio/codec) - (parent 0xf1)
init 398 0x18f choosen (/choosen) - (parent 0xf3)
init 400 0x191 pmu_stat (/leds/pmu_stat) - (parent 0x190)
init 401 0x192 heartbeat (/leds/heartbeat) - (parent 0x190)
init 402 0x193 mmc (/leds/mmc) - (parent 0x190)
init 405 0x196 user (/gpio_keys/user) - (parent 0x195)


Regards,

Alexander Holler
Alexander Holler
2014-05-12 16:47:53 UTC
Permalink
Use the properties named 'dependencies' in binary device tree blobs to build
a dependency based initialization order for platform devices and drivers.

This is done by building a directed acyclic graph using an adjacency list
and doing a topological sort to retrieve the order in which devices/drivers
should be created/initialized.

Signed-off-by: Alexander Holler <***@ahsoftware.de>
---
arch/arm/kernel/setup.c | 20 +-
drivers/of/Kconfig | 10 +
drivers/of/Makefile | 1 +
drivers/of/of_dependencies.c | 403 ++++++++++++++++++++++++++++++++++++++++
drivers/of/platform.c | 32 +++-
include/linux/of.h | 15 ++
include/linux/of_dependencies.h | 61 ++++++
include/linux/of_platform.h | 5 +
8 files changed, 543 insertions(+), 4 deletions(-)
create mode 100644 drivers/of/of_dependencies.c
create mode 100644 include/linux/of_dependencies.h

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1e8b030..f67387d 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -19,6 +19,7 @@
#include <linux/seq_file.h>
#include <linux/screen_info.h>
#include <linux/of_platform.h>
+#include <linux/of_dependencies.h>
#include <linux/init.h>
#include <linux/kexec.h>
#include <linux/of_fdt.h>
@@ -787,10 +788,19 @@ static int __init customize_machine(void)
if (machine_desc->init_machine)
machine_desc->init_machine();
#ifdef CONFIG_OF
- else
+ else {
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (!of_init_build_order(NULL, NULL))
+ of_init_create_devices(NULL, NULL);
+ else
+ of_init_free_order();
+#else
of_platform_populate(NULL, of_default_bus_match_table,
NULL, NULL);
#endif
+ }
+#endif
+
return 0;
}
arch_initcall(customize_machine);
@@ -914,7 +924,13 @@ void __init setup_arch(char **cmdline_p)
arm_pm_restart = mdesc->restart;

unflatten_device_tree();
-
+#ifdef CONFIG_OF_DEPENDENCIES
+ /*
+ * No alloc used in of_init_build_order(), therefor it would work
+ * already here too.
+ */
+ /* of_init_build_order(NULL, NULL); */
+#endif
arm_dt_init_cpu_maps();
psci_init();
#ifdef CONFIG_SMP
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index c6973f1..a7e1614 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -75,4 +75,14 @@ config OF_MTD
depends on MTD
def_bool y

+config OF_DEPENDENCIES
+ bool "Device Tree dependency based initialization order (EXPERIMENTAL)"
+ help
+ Enables dependency based initialization order of platform drivers.
+
+ For correct operation the binary DT blob should have been
+ populated with properties of type "dependencies".
+
+ If unsure, say N here.
+
endmenu # OF
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index efd0510..3888d9c 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_OF_MDIO) += of_mdio.o
obj-$(CONFIG_OF_PCI) += of_pci.o
obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o
obj-$(CONFIG_OF_MTD) += of_mtd.o
+obj-$(CONFIG_OF_DEPENDENCIES) += of_dependencies.o
diff --git a/drivers/of/of_dependencies.c b/drivers/of/of_dependencies.c
new file mode 100644
index 0000000..7905172
--- /dev/null
+++ b/drivers/of/of_dependencies.c
@@ -0,0 +1,403 @@
+/*
+ * Code for building a deterministic initialization order based on dependencies
+ * defined in the device tree.
+ *
+ * Copyright (C) 2014 Alexander Holler <***@ahsoftware.de>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/of_dependencies.h>
+
+#define MAX_DT_NODES 1000 /* maximum number of vertices */
+#define MAX_EDGES (MAX_DT_NODES*2) /* maximum number of edges (dependencies) */
+
+struct edgenode {
+ uint32_t y; /* phandle */
+ struct edgenode *next; /* next edge in list */
+};
+
+/*
+ * Vertex numbers do correspond to phandle numbers. That means the graph
+ * does contain as much vertices as the maximum of all phandles.
+ * Or in other words, we assume that for all phandles in the device tree
+ * 0 < phandle < MAX_DT_NODES+1 is true.
+ */
+struct dep_graph {
+ struct edgenode edge_slots[MAX_EDGES]; /* used to avoid kmalloc */
+ struct edgenode *edges[MAX_DT_NODES+1]; /* adjacency info */
+ unsigned nvertices; /* number of vertices in graph */
+ unsigned nedges; /* number of edges in graph */
+ bool processed[MAX_DT_NODES+1]; /* which vertices have been processed */
+ bool include_node[MAX_DT_NODES+1]; /* which nodes to consider */
+ bool discovered[MAX_DT_NODES+1]; /* which vertices have been found */
+ bool finished; /* if true, cut off search immediately */
+};
+static struct dep_graph graph __initdata;
+
+struct init_order {
+ uint32_t max_phandle; /* the max used phandle */
+ uint32_t old_max_phandle; /* used to keep track of added phandles */
+ struct device_node *order[MAX_DT_NODES+1];
+ unsigned count;
+ /* Used to keep track of parent devices in regard to the DT */
+ uint32_t parent_by_phandle[MAX_DT_NODES+1];
+ struct device *device_by_phandle[MAX_DT_NODES+1];
+};
+static struct init_order order __initdata;
+
+
+/* Copied from drivers/of/base.c (because it's lockless). */
+static struct property * __init __of_find_property(const struct device_node *np,
+ const char *name, int *lenp)
+{
+ struct property *pp;
+
+ if (!np)
+ return NULL;
+
+ for (pp = np->properties; pp; pp = pp->next) {
+ if (of_prop_cmp(pp->name, name) == 0) {
+ if (lenp)
+ *lenp = pp->length;
+ break;
+ }
+ }
+
+ return pp;
+}
+
+/* Copied from drivers/of/base.c (because it's lockless). */
+static const void * __init __of_get_property(const struct device_node *np,
+ const char *name, int *lenp)
+{
+ struct property *pp = __of_find_property(np, name, lenp);
+
+ return pp ? pp->value : NULL;
+}
+
+/* Copied from drivers/of/base.c (because it's lockless). */
+static int __init __of_device_is_available(const struct device_node *device)
+{
+ const char *status;
+ int statlen;
+
+ if (!device)
+ return 0;
+
+ status = __of_get_property(device, "status", &statlen);
+ if (status == NULL)
+ return 1;
+
+ if (statlen > 0) {
+ if (!strcmp(status, "okay") || !strcmp(status, "ok"))
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
+ * x is a dependant of y or in other words
+ * y will be initialized before x.
+ */
+static int __init insert_edge(uint32_t x, uint32_t y)
+{
+ struct edgenode *p; /* temporary pointer */
+
+ if (unlikely(x > MAX_DT_NODES || y > MAX_DT_NODES)) {
+ pr_err("Node found with phandle 0x%x > MAX_DT_NODES (%d)!\n",
+ x > MAX_DT_NODES ? x : y, MAX_DT_NODES);
+ return -EINVAL;
+ }
+ if (unlikely(!x || !y))
+ return 0;
+ if (unlikely(graph.nedges >= MAX_EDGES)) {
+ pr_err("Maximum number of edges (%d) reached!\n", MAX_EDGES);
+ return -EINVAL;
+ }
+ p = &graph.edge_slots[graph.nedges++];
+ graph.include_node[x] = 1;
+ graph.include_node[y] = 1;
+ p->y = y;
+ p->next = graph.edges[x];
+ graph.edges[x] = p; /* insert at head of list */
+
+ graph.nvertices = (x > graph.nvertices) ? x : graph.nvertices;
+ graph.nvertices = (y > graph.nvertices) ? y : graph.nvertices;
+ return 0;
+}
+
+static void __init print_node_name(uint32_t v)
+{
+ struct device_node *node;
+
+ node = of_find_node_by_phandle(v);
+ if (!node) {
+ pr_cont("Node for phandle 0x%x not found", v);
+ return;
+ }
+ if (node->name)
+ pr_cont("%s", node->name);
+ if (node->full_name)
+ pr_cont(" (%s)", node->full_name);
+ of_node_put(node);
+}
+
+/*
+ * I would prefer to use the BGL (Boost Graph Library), but as I can't use it
+ * here (for obvious reasons), the next four functions below are based on
+ * code of Steven Skiena's book 'The Algorithm Design Manual'.
+ */
+
+static void __init process_edge(uint32_t x, uint32_t y)
+{
+ if (unlikely(graph.discovered[y] && !graph.processed[y])) {
+ pr_err("Cycle found 0x%x ", x);
+ print_node_name(x);
+ pr_cont(" <-> 0x%x ", y);
+ print_node_name(y);
+ pr_cont("!\n");
+ graph.finished = 1;
+ }
+}
+
+static void __init process_vertex_late(uint32_t v)
+{
+ struct device_node *node;
+
+ node = of_find_node_by_phandle(v);
+ if (!node) {
+ pr_err("No node for phandle 0x%x not found", v);
+ return;
+ }
+ order.order[order.count++] = node;
+}
+
+static void __init depth_first_search(uint32_t v)
+{
+ struct edgenode *p;
+ uint32_t y; /* successor vertex */
+
+ if (graph.finished)
+ return;
+ graph.discovered[v] = 1;
+ p = graph.edges[v];
+ while (p) {
+ y = p->y;
+ if (!graph.discovered[y]) {
+ process_edge(v, y);
+ depth_first_search(y);
+ } else
+ process_edge(v, y);
+ if (graph.finished)
+ return;
+ p = p->next;
+ }
+ process_vertex_late(v);
+ graph.processed[v] = 1;
+}
+
+static void __init topological_sort(void)
+{
+ unsigned i;
+
+ for (i = 1; i <= graph.nvertices; ++i)
+ if (!graph.discovered[i] && graph.include_node[i])
+ depth_first_search(i);
+}
+
+static int __init add_dep_list(struct device_node *node,
+ const struct of_device_id *matches)
+{
+ const __be32 *list, *list_end;
+ uint32_t ph;
+ int size;
+ int rc = 0;
+ struct device_node *dep;
+
+ list = __of_get_property(node, "dependencies", &size);
+ if (!list || !size || size%sizeof(*list))
+ return 0;
+ list_end = list + size / sizeof(*list);
+ while (list < list_end) {
+ ph = be32_to_cpup(list++);
+ if (unlikely(!ph)) {
+ /* Should never happen */
+ if (node->name)
+ pr_warn("phandle == 0 for %s\n", node->name);
+ continue;
+ }
+ dep = of_find_node_by_phandle(ph);
+ if (unlikely(!dep)) {
+ pr_err("No DT node for dependency with phandle 0x%x found\n",
+ ph);
+ continue;
+ }
+ if (unlikely(matches && !of_match_node(matches, dep)))
+ continue;
+ rc = insert_edge(node->phandle, ph);
+ if (rc)
+ break;
+ }
+
+ return rc;
+}
+
+static int __init add_deps(struct device_node *parent, struct device_node *node,
+ const struct of_device_id *matches)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ if (!__of_device_is_available(node))
+ return 0;
+ if (__of_get_property(node, "compatible", NULL)) {
+ if (!parent->phandle) {
+ if (__of_get_property(parent, "compatible", NULL))
+ parent->phandle = 1 + order.max_phandle++;
+ }
+ if (!node->phandle)
+ node->phandle = 1 + order.max_phandle++;
+ rc = insert_edge(node->phandle, parent->phandle);
+ if (rc)
+ return rc;
+ if (unlikely(order.parent_by_phandle[node->phandle])) {
+ /* sanity check */
+ pr_err("0x%x already has a parent!\n", node->phandle);
+ return -EINVAL;
+ }
+ order.parent_by_phandle[node->phandle] = parent->phandle;
+ rc = add_dep_list(node, matches);
+ if (unlikely(rc))
+ return rc;
+ parent = node; /* change the parent only if node is a driver */
+ }
+ if (unlikely(matches && !of_match_node(matches, node)))
+ return rc;
+ for_each_child_of_node(node, child) {
+ rc = add_deps(parent, child, matches);
+ if (unlikely(rc))
+ break;
+ }
+
+ return rc;
+}
+
+static void __init calc_max_phandle(void)
+{
+ struct device_node *np;
+ uint32_t t = 0;
+
+ for (np = of_allnodes; np; np = np->allnext)
+ if (np->phandle > t)
+ t = np->phandle;
+ order.max_phandle = t;
+ return;
+}
+
+/*
+static void __init remove_new_phandles(void)
+{
+ struct device_node *np;
+
+ for (np = of_allnodes; np; np = np->allnext)
+ if (np->phandle > order.old_max_phandle)
+ np->phandle = 0;
+}
+
+static void __init of_init_print_order(void)
+{
+ unsigned i;
+ struct property *prop;
+ const char *cp;
+
+ pr_info("Initialization order:\n");
+ for (i = 0; i < order.count; ++i) {
+ pr_info("init %u 0x%x", i, order.order[i]->phandle);
+ if (order.order[i]->name)
+ pr_cont(" %s", order.order[i]->name);
+ if (order.order[i]->full_name)
+ pr_cont(" (%s)", order.order[i]->full_name);
+ prop = __of_find_property(order.order[i], "compatible", NULL);
+ for (cp = of_prop_next_string(prop, NULL); cp;
+ cp = of_prop_next_string(prop, cp))
+ pr_cont(" %s", cp);
+ pr_cont(" (parent 0x%x)\n",
+ order.parent_by_phandle[order.order[i]->phandle]);
+ }
+}
+*/
+
+int __init of_init_build_order(struct device_node *root,
+ const struct of_device_id *matches)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ root = root ? of_node_get(root) : of_find_node_by_path("/");
+ if (unlikely(!root))
+ return -EINVAL;
+
+ calc_max_phandle();
+ order.old_max_phandle = order.max_phandle;
+
+ for_each_child_of_node(root, child) {
+ rc = add_deps(root, child, matches);
+ if (unlikely(rc))
+ break;
+ }
+
+ of_node_put(root);
+ topological_sort();
+
+ if (graph.finished)
+ return -EINVAL; /* cycle found */
+
+ /* of_init_print_order(); */
+
+ return rc;
+}
+
+void __init of_init_create_devices(const struct of_device_id *blacklist,
+ const struct of_dev_auxdata *lookup)
+{
+ unsigned i;
+ struct platform_device *dev;
+
+ for (i = 0; i < order.count; ++i) {
+ struct device_node *node = order.order[i];
+ uint32_t parent_ph = order.parent_by_phandle[node->phandle];
+
+ if (unlikely(blacklist &&
+ of_match_node(blacklist, node))) {
+ of_node_put(node);
+ continue;
+ }
+ if (unlikely(parent_ph &&
+ !order.device_by_phandle[parent_ph])) {
+ /* init of parent failed */
+ of_node_put(node);
+ continue;
+ }
+ dev = of_dependencies_device_create(node, lookup,
+ order.device_by_phandle[parent_ph]);
+ if (dev)
+ order.device_by_phandle[node->phandle] = &dev->dev;
+ of_node_put(node);
+ }
+ /* remove_new_phandles(); */
+}
+
+void __init of_init_free_order(void)
+{
+ unsigned i;
+
+ for (i = 0; i < order.count; ++i)
+ of_node_put(order.order[i]);
+ order.count = 0;
+ /* remove_new_phandles(); */
+}
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 404d1da..0fe03ad 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -204,9 +204,13 @@ static struct platform_device *of_platform_device_create_pdata(
{
struct platform_device *dev;

+#ifdef CONFIG_OF_DEPENDENCIES
+ /* WARN_ON(np->dev_created); */
+ if (np->dev_created)
+ return np->dev_created;
+#endif
if (!of_device_is_available(np))
return NULL;
-
dev = of_device_alloc(np, bus_id, parent);
if (!dev)
return NULL;
@@ -229,7 +233,9 @@ static struct platform_device *of_platform_device_create_pdata(
platform_device_put(dev);
return NULL;
}
-
+#ifdef CONFIG_OF_DEPENDENCIES
+ np->dev_created = dev;
+#endif
return dev;
}

@@ -486,3 +492,25 @@ int of_platform_populate(struct device_node *root,
}
EXPORT_SYMBOL_GPL(of_platform_populate);
#endif /* CONFIG_OF_ADDRESS */
+
+#ifdef CONFIG_OF_DEPENDENCIES
+struct platform_device * __init of_dependencies_device_create(
+ struct device_node *bus,
+ const struct of_dev_auxdata *lookup,
+ struct device *parent)
+{
+ const struct of_dev_auxdata *auxdata;
+ const char *bus_id = NULL;
+ void *platform_data = NULL;
+
+ if (lookup) {
+ auxdata = of_dev_lookup(lookup, bus);
+ if (auxdata) {
+ bus_id = auxdata->name;
+ platform_data = auxdata->platform_data;
+ }
+ }
+ return of_platform_device_create_pdata(bus, bus_id, platform_data,
+ parent);
+}
+#endif
diff --git a/include/linux/of.h b/include/linux/of.h
index 435cb99..0bf0341 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -65,6 +65,21 @@ struct device_node {
unsigned int unique_id;
struct of_irq_controller *irq_trans;
#endif
+#ifdef CONFIG_OF_DEPENDENCIES
+ /*
+ * This is needed to keep track of already created devices.
+ * The reason is that some drivers call of_platform_populate()
+ * themself to populate e.g. their subtree. This would end up
+ * that some devices would be initialzed twice with a dependency
+ * based initialization. So instead of registering a device a second
+ * time, the second call to of_platform_device_create_pdata() just
+ * returns this pointer.
+ * If the feature of dependency based initialization will end up
+ * in mainline (and drivers will have fixed), this helper could
+ * be removed.
+ */
+ struct platform_device *dev_created;
+#endif
};

#define MAX_PHANDLE_ARGS 8
diff --git a/include/linux/of_dependencies.h b/include/linux/of_dependencies.h
new file mode 100644
index 0000000..e046ce2
--- /dev/null
+++ b/include/linux/of_dependencies.h
@@ -0,0 +1,61 @@
+#ifndef _LINUX_OF_DEPENDENCIES_H
+#define _LINUX_OF_DEPENDENCIES_H
+/*
+ * Definitions for building a deterministic initialization order based on
+ * dependencies defined in the device tree.
+ *
+ * Copyright (C) 2014 Alexander Holler <***@ahsoftware.de>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/of_platform.h>
+
+/*
+ * Builds the initialization order.
+ *
+ * In favor of speed this function doesn't lock anything, so make sure nothing
+ * modifies the device tree while this functions runs.
+ *
+ * Will raise the refcount of all device tree nodes which ended up in the final
+ * initialization order by one.
+ *
+ * The phandle of some nodes will be modified (from 0 to a number) without
+ * adding a phandle property. But as this should not disturb anything, this
+ * change is not reversed after building the init order (for which the new
+ * phandles are temporarily necessary).
+ *
+ * This function is meant to be called only once.
+ */
+extern int of_init_build_order(struct device_node *root,
+ const struct of_device_id *matches);
+
+/*
+ * Replacement for of_platform_populate(). Creates all devices.
+ *
+ * By default it should be called with matches = NULL in order to create
+ * all devices. Reasoning is that every device contained in the DT which
+ * isn't disabled actually does exist (regardless if a driver is available
+ * or not).
+ *
+ * Decreases the node count of all nodes contained in the initialization order
+ * by one.
+ *
+ * This function is meant to be called only once.
+ */
+extern void of_init_create_devices(const struct of_device_id *matches,
+ const struct of_dev_auxdata *lookup);
+
+/*
+ * Decreases the node count of all nodes contained in the initialization order
+ * by one.
+ *
+ * This function is meant to be called only once instead of
+ * of_init_create_devices().
+ */
+extern void of_init_free_order(void);
+
+#endif /* _LINUX_OF_DEPENDENCIES_H */
diff --git a/include/linux/of_platform.h b/include/linux/of_platform.h
index 05cb4a9..04357ac 100644
--- a/include/linux/of_platform.h
+++ b/include/linux/of_platform.h
@@ -82,4 +82,9 @@ static inline int of_platform_populate(struct device_node *root,
}
#endif

+extern struct platform_device *of_dependencies_device_create(
+ struct device_node *bus,
+ const struct of_dev_auxdata *lookup,
+ struct device *parent);
+
#endif /* _LINUX_OF_PLATFORM_H */
--
1.8.3.1
Grant Likely
2014-05-14 14:05:34 UTC
Permalink
Post by Alexander Holler
Use the properties named 'dependencies' in binary device tree blobs to build
a dependency based initialization order for platform devices and drivers.
This is done by building a directed acyclic graph using an adjacency list
and doing a topological sort to retrieve the order in which devices/drivers
should be created/initialized.
Hi Alexander,

Thanks for looking at this. It is a difficult problem. I've made
comments below, but first I've got some general comments...

First, I'm going to be very cautious about this. It is a complicated
piece of code making the device initialization process a lot more
complicated than it already is. I'm the first one to admit that deferred
probe handles the problem in quite a naive manner, but it is simple,
correct (when drivers support deferred probe) and easy to audit. This
series digs deep into the registration order of *both* devices and
drivers which gives me the heebee jeebees.

Personally, I think the parts of this patch that manipulate the device registration
order is entirely the wrong way to handle it. If anything, I would say
continue to register the devices, even if the dependencies are unmet.
Instead, I would focus on the driver core (drivers/base) to catch
device probe attempts when a known dependency is not met, remember it,
and perform the probe after the other driver shows up. That is also
going to be a complicated bit of code, but it works for every kind of
device, not just platform_devices, and has far less impact on the
platform setup code.

BTW, this has to be able to work at the level of struct device instead
of struct platform_device. There are far more kinds of devices than just
platform_device, and they all have the same problem.

Also, may I suggest that the more pieces that you can break this series
up into, the greater chance you'll have of getting a smaller subset
merged earlier if it can be proven to be useful on its own.
Post by Alexander Holler
---
arch/arm/kernel/setup.c | 20 +-
drivers/of/Kconfig | 10 +
drivers/of/Makefile | 1 +
drivers/of/of_dependencies.c | 403 ++++++++++++++++++++++++++++++++++++++++
drivers/of/platform.c | 32 +++-
include/linux/of.h | 15 ++
include/linux/of_dependencies.h | 61 ++++++
include/linux/of_platform.h | 5 +
8 files changed, 543 insertions(+), 4 deletions(-)
create mode 100644 drivers/of/of_dependencies.c
create mode 100644 include/linux/of_dependencies.h
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 1e8b030..f67387d 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -19,6 +19,7 @@
#include <linux/seq_file.h>
#include <linux/screen_info.h>
#include <linux/of_platform.h>
+#include <linux/of_dependencies.h>
#include <linux/init.h>
#include <linux/kexec.h>
#include <linux/of_fdt.h>
@@ -787,10 +788,19 @@ static int __init customize_machine(void)
if (machine_desc->init_machine)
machine_desc->init_machine();
#ifdef CONFIG_OF
- else
+ else {
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (!of_init_build_order(NULL, NULL))
+ of_init_create_devices(NULL, NULL);
+ else
+ of_init_free_order();
What happens when of_init_build_order() fails? Does the whole system
fall over?
Post by Alexander Holler
+#else
of_platform_populate(NULL, of_default_bus_match_table,
NULL, NULL);
#endif
+ }
+#endif
+
return 0;
}
arch_initcall(customize_machine);
@@ -914,7 +924,13 @@ void __init setup_arch(char **cmdline_p)
arm_pm_restart = mdesc->restart;
unflatten_device_tree();
-
+#ifdef CONFIG_OF_DEPENDENCIES
+ /*
+ * No alloc used in of_init_build_order(), therefor it would work
+ * already here too.
+ */
+ /* of_init_build_order(NULL, NULL); */
+#endif
Stale hunk left in patch?
Post by Alexander Holler
arm_dt_init_cpu_maps();
psci_init();
#ifdef CONFIG_SMP
diff --git a/drivers/of/Kconfig b/drivers/of/Kconfig
index c6973f1..a7e1614 100644
--- a/drivers/of/Kconfig
+++ b/drivers/of/Kconfig
@@ -75,4 +75,14 @@ config OF_MTD
depends on MTD
def_bool y
+config OF_DEPENDENCIES
+ bool "Device Tree dependency based initialization order (EXPERIMENTAL)"
+ help
+ Enables dependency based initialization order of platform drivers.
+
+ For correct operation the binary DT blob should have been
+ populated with properties of type "dependencies".
+
+ If unsure, say N here.
+
endmenu # OF
diff --git a/drivers/of/Makefile b/drivers/of/Makefile
index efd0510..3888d9c 100644
--- a/drivers/of/Makefile
+++ b/drivers/of/Makefile
@@ -9,3 +9,4 @@ obj-$(CONFIG_OF_MDIO) += of_mdio.o
obj-$(CONFIG_OF_PCI) += of_pci.o
obj-$(CONFIG_OF_PCI_IRQ) += of_pci_irq.o
obj-$(CONFIG_OF_MTD) += of_mtd.o
+obj-$(CONFIG_OF_DEPENDENCIES) += of_dependencies.o
diff --git a/drivers/of/of_dependencies.c b/drivers/of/of_dependencies.c
new file mode 100644
index 0000000..7905172
--- /dev/null
+++ b/drivers/of/of_dependencies.c
@@ -0,0 +1,403 @@
+/*
+ * Code for building a deterministic initialization order based on dependencies
+ * defined in the device tree.
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/of_dependencies.h>
+
+#define MAX_DT_NODES 1000 /* maximum number of vertices */
+#define MAX_EDGES (MAX_DT_NODES*2) /* maximum number of edges (dependencies) */
Is it possible for this to be dynamic? DT platforms range from small to
very very large.
Post by Alexander Holler
+
+struct edgenode {
+ uint32_t y; /* phandle */
+ struct edgenode *next; /* next edge in list */
+};
+
+/*
+ * Vertex numbers do correspond to phandle numbers. That means the graph
+ * does contain as much vertices as the maximum of all phandles.
+ * Or in other words, we assume that for all phandles in the device tree
+ * 0 < phandle < MAX_DT_NODES+1 is true.
+ */
+struct dep_graph {
+ struct edgenode edge_slots[MAX_EDGES]; /* used to avoid kmalloc */
+ struct edgenode *edges[MAX_DT_NODES+1]; /* adjacency info */
+ unsigned nvertices; /* number of vertices in graph */
+ unsigned nedges; /* number of edges in graph */
+ bool processed[MAX_DT_NODES+1]; /* which vertices have been processed */
+ bool include_node[MAX_DT_NODES+1]; /* which nodes to consider */
+ bool discovered[MAX_DT_NODES+1]; /* which vertices have been found */
+ bool finished; /* if true, cut off search immediately */
+};
+static struct dep_graph graph __initdata;
+
+struct init_order {
+ uint32_t max_phandle; /* the max used phandle */
+ uint32_t old_max_phandle; /* used to keep track of added phandles */
+ struct device_node *order[MAX_DT_NODES+1];
+ unsigned count;
+ /* Used to keep track of parent devices in regard to the DT */
+ uint32_t parent_by_phandle[MAX_DT_NODES+1];
+ struct device *device_by_phandle[MAX_DT_NODES+1];
+};
+static struct init_order order __initdata;
I would suggest splitting the core graph support into a separate patch
to keep things smaller and to keep the behaviour changes separate from
the support function additions.
Post by Alexander Holler
+
+
+/* Copied from drivers/of/base.c (because it's lockless). */
Copying isn't a good idea. The function will need to be made accessible
to other files in the drivers/of directory.
Post by Alexander Holler
+static struct property * __init __of_find_property(const struct device_node *np,
+ const char *name, int *lenp)
+{
+ struct property *pp;
+
+ if (!np)
+ return NULL;
+
+ for (pp = np->properties; pp; pp = pp->next) {
+ if (of_prop_cmp(pp->name, name) == 0) {
+ if (lenp)
+ *lenp = pp->length;
+ break;
+ }
+ }
+
+ return pp;
+}
+
+/* Copied from drivers/of/base.c (because it's lockless). */
+static const void * __init __of_get_property(const struct device_node *np,
+ const char *name, int *lenp)
+{
+ struct property *pp = __of_find_property(np, name, lenp);
+
+ return pp ? pp->value : NULL;
+}
+
+/* Copied from drivers/of/base.c (because it's lockless). */
+static int __init __of_device_is_available(const struct device_node *device)
+{
+ const char *status;
+ int statlen;
+
+ if (!device)
+ return 0;
+
+ status = __of_get_property(device, "status", &statlen);
+ if (status == NULL)
+ return 1;
+
+ if (statlen > 0) {
+ if (!strcmp(status, "okay") || !strcmp(status, "ok"))
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
+ * x is a dependant of y or in other words
+ * y will be initialized before x.
+ */
+static int __init insert_edge(uint32_t x, uint32_t y)
+{
+ struct edgenode *p; /* temporary pointer */
+
+ if (unlikely(x > MAX_DT_NODES || y > MAX_DT_NODES)) {
+ pr_err("Node found with phandle 0x%x > MAX_DT_NODES (%d)!\n",
+ x > MAX_DT_NODES ? x : y, MAX_DT_NODES);
+ return -EINVAL;
+ }
+ if (unlikely(!x || !y))
+ return 0;
+ if (unlikely(graph.nedges >= MAX_EDGES)) {
+ pr_err("Maximum number of edges (%d) reached!\n", MAX_EDGES);
+ return -EINVAL;
+ }
+ p = &graph.edge_slots[graph.nedges++];
+ graph.include_node[x] = 1;
+ graph.include_node[y] = 1;
+ p->y = y;
+ p->next = graph.edges[x];
+ graph.edges[x] = p; /* insert at head of list */
+
+ graph.nvertices = (x > graph.nvertices) ? x : graph.nvertices;
+ graph.nvertices = (y > graph.nvertices) ? y : graph.nvertices;
+ return 0;
+}
+
+static void __init print_node_name(uint32_t v)
+{
+ struct device_node *node;
+
+ node = of_find_node_by_phandle(v);
+ if (!node) {
+ pr_cont("Node for phandle 0x%x not found", v);
+ return;
+ }
+ if (node->name)
+ pr_cont("%s", node->name);
+ if (node->full_name)
+ pr_cont(" (%s)", node->full_name);
+ of_node_put(node);
+}
+
+/*
+ * I would prefer to use the BGL (Boost Graph Library), but as I can't use it
+ * here (for obvious reasons), the next four functions below are based on
+ * code of Steven Skiena's book 'The Algorithm Design Manual'.
+ */
+
+static void __init process_edge(uint32_t x, uint32_t y)
+{
+ if (unlikely(graph.discovered[y] && !graph.processed[y])) {
+ pr_err("Cycle found 0x%x ", x);
+ print_node_name(x);
+ pr_cont(" <-> 0x%x ", y);
+ print_node_name(y);
+ pr_cont("!\n");
+ graph.finished = 1;
+ }
+}
+
+static void __init process_vertex_late(uint32_t v)
+{
+ struct device_node *node;
+
+ node = of_find_node_by_phandle(v);
+ if (!node) {
+ pr_err("No node for phandle 0x%x not found", v);
+ return;
+ }
+ order.order[order.count++] = node;
+}
+
+static void __init depth_first_search(uint32_t v)
+{
+ struct edgenode *p;
+ uint32_t y; /* successor vertex */
+
+ if (graph.finished)
+ return;
+ graph.discovered[v] = 1;
+ p = graph.edges[v];
+ while (p) {
+ y = p->y;
+ if (!graph.discovered[y]) {
+ process_edge(v, y);
+ depth_first_search(y);
+ } else
+ process_edge(v, y);
+ if (graph.finished)
+ return;
+ p = p->next;
+ }
+ process_vertex_late(v);
+ graph.processed[v] = 1;
+}
+
+static void __init topological_sort(void)
+{
+ unsigned i;
+
+ for (i = 1; i <= graph.nvertices; ++i)
+ if (!graph.discovered[i] && graph.include_node[i])
+ depth_first_search(i);
+}
+
+static int __init add_dep_list(struct device_node *node,
+ const struct of_device_id *matches)
+{
+ const __be32 *list, *list_end;
+ uint32_t ph;
+ int size;
+ int rc = 0;
+ struct device_node *dep;
+
+ list = __of_get_property(node, "dependencies", &size);
+ if (!list || !size || size%sizeof(*list))
+ return 0;
+ list_end = list + size / sizeof(*list);
+ while (list < list_end) {
+ ph = be32_to_cpup(list++);
+ if (unlikely(!ph)) {
+ /* Should never happen */
+ if (node->name)
+ pr_warn("phandle == 0 for %s\n", node->name);
+ continue;
+ }
+ dep = of_find_node_by_phandle(ph);
+ if (unlikely(!dep)) {
+ pr_err("No DT node for dependency with phandle 0x%x found\n",
+ ph);
+ continue;
+ }
+ if (unlikely(matches && !of_match_node(matches, dep)))
+ continue;
+ rc = insert_edge(node->phandle, ph);
+ if (rc)
+ break;
+ }
+
+ return rc;
+}
+
+static int __init add_deps(struct device_node *parent, struct device_node *node,
+ const struct of_device_id *matches)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ if (!__of_device_is_available(node))
+ return 0;
+ if (__of_get_property(node, "compatible", NULL)) {
+ if (!parent->phandle) {
+ if (__of_get_property(parent, "compatible", NULL))
+ parent->phandle = 1 + order.max_phandle++;
+ }
+ if (!node->phandle)
+ node->phandle = 1 + order.max_phandle++;
+ rc = insert_edge(node->phandle, parent->phandle);
+ if (rc)
+ return rc;
+ if (unlikely(order.parent_by_phandle[node->phandle])) {
+ /* sanity check */
+ pr_err("0x%x already has a parent!\n", node->phandle);
+ return -EINVAL;
+ }
+ order.parent_by_phandle[node->phandle] = parent->phandle;
+ rc = add_dep_list(node, matches);
+ if (unlikely(rc))
+ return rc;
+ parent = node; /* change the parent only if node is a driver */
+ }
+ if (unlikely(matches && !of_match_node(matches, node)))
+ return rc;
+ for_each_child_of_node(node, child) {
+ rc = add_deps(parent, child, matches);
+ if (unlikely(rc))
+ break;
+ }
+
+ return rc;
+}
+
+static void __init calc_max_phandle(void)
+{
+ struct device_node *np;
+ uint32_t t = 0;
+
+ for (np = of_allnodes; np; np = np->allnext)
+ if (np->phandle > t)
+ t = np->phandle;
+ order.max_phandle = t;
+ return;
+}
+
+/*
+static void __init remove_new_phandles(void)
+{
+ struct device_node *np;
+
+ for (np = of_allnodes; np; np = np->allnext)
+ if (np->phandle > order.old_max_phandle)
+ np->phandle = 0;
+}
+
+static void __init of_init_print_order(void)
+{
+ unsigned i;
+ struct property *prop;
+ const char *cp;
+
+ pr_info("Initialization order:\n");
+ for (i = 0; i < order.count; ++i) {
+ pr_info("init %u 0x%x", i, order.order[i]->phandle);
+ if (order.order[i]->name)
+ pr_cont(" %s", order.order[i]->name);
+ if (order.order[i]->full_name)
+ pr_cont(" (%s)", order.order[i]->full_name);
+ prop = __of_find_property(order.order[i], "compatible", NULL);
+ for (cp = of_prop_next_string(prop, NULL); cp;
+ cp = of_prop_next_string(prop, cp))
+ pr_cont(" %s", cp);
+ pr_cont(" (parent 0x%x)\n",
+ order.parent_by_phandle[order.order[i]->phandle]);
+ }
+}
+*/
+
+int __init of_init_build_order(struct device_node *root,
+ const struct of_device_id *matches)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ root = root ? of_node_get(root) : of_find_node_by_path("/");
+ if (unlikely(!root))
+ return -EINVAL;
+
+ calc_max_phandle();
+ order.old_max_phandle = order.max_phandle;
+
+ for_each_child_of_node(root, child) {
+ rc = add_deps(root, child, matches);
+ if (unlikely(rc))
+ break;
+ }
+
+ of_node_put(root);
+ topological_sort();
Can the sort be performed incrementally? The DT is a dynamic structure
on some platforms. Search for OF_RECONFIG_. There is work in progress to
add overlay support to the device tree so that batches of new nodes can
be added to the tree after userspace has started. The dependency code
will need to handle that situation gracefully.
Post by Alexander Holler
+
+ if (graph.finished)
+ return -EINVAL; /* cycle found */
+
+ /* of_init_print_order(); */
If you wrap of_init_print_order with a #ifdef DEBUG/#else/#endif, then
you don't need to comment out the call to of_init_print_order().
Post by Alexander Holler
+
+ return rc;
+}
+
+void __init of_init_create_devices(const struct of_device_id *blacklist,
+ const struct of_dev_auxdata *lookup)
+{
+ unsigned i;
+ struct platform_device *dev;
I don't like that of_init_create_devices() has a completely different
calling convention from of_platform_populate(). of_platform_populate()
is passed a match table for devices that are to act as buses (which
means register the children also). This function is passed a blacklist
instead which is a completely different semantic.

That means it cannot be used by device drivers that register their own
children and it has to make a lot of assumptions about what should and
should not be registered as platform_devices.

How does the dependency code decide which devices can be
platform_devices? It's not clear to me from what I've read so far.
Post by Alexander Holler
+
+ for (i = 0; i < order.count; ++i) {
+ struct device_node *node = order.order[i];
+ uint32_t parent_ph = order.parent_by_phandle[node->phandle];
+
+ if (unlikely(blacklist &&
+ of_match_node(blacklist, node))) {
+ of_node_put(node);
+ continue;
+ }
+ if (unlikely(parent_ph &&
+ !order.device_by_phandle[parent_ph])) {
+ /* init of parent failed */
+ of_node_put(node);
+ continue;
+ }
+ dev = of_dependencies_device_create(node, lookup,
+ order.device_by_phandle[parent_ph]);
+ if (dev)
+ order.device_by_phandle[node->phandle] = &dev->dev;
+ of_node_put(node);
+ }
+ /* remove_new_phandles(); */
+}
I could use some help understanding what is being done here. It looks
like it is going through and only registering devices that have a
dependency parent already created, or don't have a parent at all. Am I
correct?

It looks like this patch alone will break the kernel because it depends
also on the functionality in patch 5. The patches would need to be
reordered to handle that situation.
Post by Alexander Holler
+void __init of_init_free_order(void)
+{
+ unsigned i;
+
+ for (i = 0; i < order.count; ++i)
+ of_node_put(order.order[i]);
+ order.count = 0;
+ /* remove_new_phandles(); */
+}
diff --git a/drivers/of/platform.c b/drivers/of/platform.c
index 404d1da..0fe03ad 100644
--- a/drivers/of/platform.c
+++ b/drivers/of/platform.c
@@ -204,9 +204,13 @@ static struct platform_device *of_platform_device_create_pdata(
{
struct platform_device *dev;
+#ifdef CONFIG_OF_DEPENDENCIES
+ /* WARN_ON(np->dev_created); */
+ if (np->dev_created)
+ return np->dev_created;
+#endif
if (!of_device_is_available(np))
return NULL;
-
dev = of_device_alloc(np, bus_id, parent);
if (!dev)
return NULL;
@@ -229,7 +233,9 @@ static struct platform_device *of_platform_device_create_pdata(
platform_device_put(dev);
return NULL;
}
-
+#ifdef CONFIG_OF_DEPENDENCIES
+ np->dev_created = dev;
+#endif
return dev;
}
@@ -486,3 +492,25 @@ int of_platform_populate(struct device_node *root,
}
EXPORT_SYMBOL_GPL(of_platform_populate);
#endif /* CONFIG_OF_ADDRESS */
+
+#ifdef CONFIG_OF_DEPENDENCIES
+struct platform_device * __init of_dependencies_device_create(
+ struct device_node *bus,
+ const struct of_dev_auxdata *lookup,
+ struct device *parent)
+{
+ const struct of_dev_auxdata *auxdata;
+ const char *bus_id = NULL;
+ void *platform_data = NULL;
+
+ if (lookup) {
+ auxdata = of_dev_lookup(lookup, bus);
+ if (auxdata) {
+ bus_id = auxdata->name;
+ platform_data = auxdata->platform_data;
+ }
+ }
+ return of_platform_device_create_pdata(bus, bus_id, platform_data,
+ parent);
+}
+#endif
diff --git a/include/linux/of.h b/include/linux/of.h
index 435cb99..0bf0341 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -65,6 +65,21 @@ struct device_node {
unsigned int unique_id;
struct of_irq_controller *irq_trans;
#endif
+#ifdef CONFIG_OF_DEPENDENCIES
+ /*
+ * This is needed to keep track of already created devices.
+ * The reason is that some drivers call of_platform_populate()
+ * themself to populate e.g. their subtree. This would end up
+ * that some devices would be initialzed twice with a dependency
+ * based initialization. So instead of registering a device a second
+ * time, the second call to of_platform_device_create_pdata() just
+ * returns this pointer.
+ * If the feature of dependency based initialization will end up
+ * in mainline (and drivers will have fixed), this helper could
+ * be removed.
+ */
+ struct platform_device *dev_created;
+#endif
This change has been proposed many times in several forms. I've pushed
back hard against it because it is absolutely normal for multiple
struct devices to reference the same device tree node.

There is a patch that Pawel has submitted which sets a flag in the
of_node if the of_platform_populate() path has been used to create a
struct device. It's a little crude, but it does handle the problem of
multiple passes through of_platform_populate(). I expect it will show up
in mainline in v3.16
Post by Alexander Holler
};
#define MAX_PHANDLE_ARGS 8
diff --git a/include/linux/of_dependencies.h b/include/linux/of_dependencies.h
new file mode 100644
index 0000000..e046ce2
--- /dev/null
+++ b/include/linux/of_dependencies.h
@@ -0,0 +1,61 @@
+#ifndef _LINUX_OF_DEPENDENCIES_H
+#define _LINUX_OF_DEPENDENCIES_H
+/*
+ * Definitions for building a deterministic initialization order based on
+ * dependencies defined in the device tree.
+ *
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include <linux/of_platform.h>
+
+/*
+ * Builds the initialization order.
+ *
+ * In favor of speed this function doesn't lock anything, so make sure nothing
+ * modifies the device tree while this functions runs.
+ *
+ * Will raise the refcount of all device tree nodes which ended up in the final
+ * initialization order by one.
+ *
+ * The phandle of some nodes will be modified (from 0 to a number) without
+ * adding a phandle property. But as this should not disturb anything, this
+ * change is not reversed after building the init order (for which the new
+ * phandles are temporarily necessary).
+ *
+ * This function is meant to be called only once.
+ */
Note: Convention in the kernel (maybe not everywhere, but certainly in
the DT code) is for function documentation to be in the .c file, not the
header. We use the kerneldoc format.

Documentation/kernel-doc-nano-HOWTO.txt
Post by Alexander Holler
+extern int of_init_build_order(struct device_node *root,
+ const struct of_device_id *matches);
+
+/*
+ * Replacement for of_platform_populate(). Creates all devices.
+ *
+ * By default it should be called with matches = NULL in order to create
+ * all devices. Reasoning is that every device contained in the DT which
+ * isn't disabled actually does exist (regardless if a driver is available
+ * or not).
+ *
+ * Decreases the node count of all nodes contained in the initialization order
+ * by one.
+ *
+ * This function is meant to be called only once.
+ */
+extern void of_init_create_devices(const struct of_device_id *matches,
+ const struct of_dev_auxdata *lookup);
+
+/*
+ * Decreases the node count of all nodes contained in the initialization order
+ * by one.
+ *
+ * This function is meant to be called only once instead of
+ * of_init_create_devices().
+ */
+extern void of_init_free_order(void);
+
+#endif /* _LINUX_OF_DEPENDENCIES_H */
diff --git a/include/linux/of_platform.h b/include/linux/of_platform.h
index 05cb4a9..04357ac 100644
--- a/include/linux/of_platform.h
+++ b/include/linux/of_platform.h
@@ -82,4 +82,9 @@ static inline int of_platform_populate(struct device_node *root,
}
#endif
+extern struct platform_device *of_dependencies_device_create(
+ struct device_node *bus,
+ const struct of_dev_auxdata *lookup,
+ struct device *parent);
+
#endif /* _LINUX_OF_PLATFORM_H */
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Alexander Holler
2014-05-14 14:49:05 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
Use the properties named 'dependencies' in binary device tree blobs to build
a dependency based initialization order for platform devices and drivers.
This is done by building a directed acyclic graph using an adjacency list
and doing a topological sort to retrieve the order in which devices/drivers
should be created/initialized.
Hi Alexander,
Thanks for looking at this. It is a difficult problem. I've made
comments below, but first I've got some general comments...
First, I'm going to be very cautious about this. It is a complicated
piece of code making the device initialization process a lot more
complicated than it already is. I'm the first one to admit that deferred
probe handles the problem in quite a naive manner, but it is simple,
correct (when drivers support deferred probe) and easy to audit. This
series digs deep into the registration order of *both* devices and
drivers which gives me the heebee jeebees.
Sure, but the approach I present is deterministic. The deferred stuff,
while it's simple and works, is imho just a workaround. Besides that
this series isn't about pro or cons of the deferred probe, the deferred
probes I have seen where just the reason why I had a look at what
happens. To conclude, I like the deferred probe because it fixes
problems, but trying to do things right is much better. And there are
already many workarounds trying fix the initialization order (e.g.
drivers which classify themself as a subsys), so starting do it right
makes imho sense.

So, I'm sorry if you see this feature as an attack on the deferred probe
stuff, it isn't meant as such, no offense here or there.
Post by Grant Likely
Personally, I think the parts of this patch that manipulate the device registration
order is entirely the wrong way to handle it. If anything, I would say
continue to register the devices, even if the dependencies are unmet.
That would just be a removal of 2 lines. I've no problem with that. ;)
Post by Grant Likely
Instead, I would focus on the driver core (drivers/base) to catch
device probe attempts when a known dependency is not met, remember it,
and perform the probe after the other driver shows up. That is also
going to be a complicated bit of code, but it works for every kind of
device, not just platform_devices, and has far less impact on the
platform setup code.
BTW, this has to be able to work at the level of struct device instead
of struct platform_device. There are far more kinds of devices than just
platform_device, and they all have the same problem.
Changing to care for devices instead of just drivers is easy to do.
Post by Grant Likely
Also, may I suggest that the more pieces that you can break this series
up into, the greater chance you'll have of getting a smaller subset
merged earlier if it can be proven to be useful on its own.
Hmm, I don't really care if that will be merged. I have no motivation to
fight with Linux kernel maintainers and I don't know if I will spend the
necessary time to do so.
Post by Grant Likely
Post by Alexander Holler
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (!of_init_build_order(NULL, NULL))
+ of_init_create_devices(NULL, NULL);
+ else
+ of_init_free_order();
What happens when of_init_build_order() fails? Does the whole system
fall over?
Yes. The only reason it can fail is when there is a cycle, and dtc
checks (and fails) for that when building the blob (dtb).
Post by Grant Likely
Post by Alexander Holler
+#else
of_platform_populate(NULL, of_default_bus_match_table,
NULL, NULL);
#endif
+ }
+#endif
+
return 0;
}
arch_initcall(customize_machine);
@@ -914,7 +924,13 @@ void __init setup_arch(char **cmdline_p)
arm_pm_restart = mdesc->restart;
unflatten_device_tree();
-
+#ifdef CONFIG_OF_DEPENDENCIES
+ /*
+ * No alloc used in of_init_build_order(), therefor it would work
+ * already here too.
+ */
+ /* of_init_build_order(NULL, NULL); */
+#endif
Stale hunk left in patch?
See here: https://lkml.org/lkml/2014/5/14/102

This are NOT patches meant for final merging!
Post by Grant Likely
I would suggest splitting the core graph support into a separate patch
to keep things smaller and to keep the behaviour changes separate from
the support function additions.
Could be done.
Post by Grant Likely
Post by Alexander Holler
+
+
+/* Copied from drivers/of/base.c (because it's lockless). */
Copying isn't a good idea. The function will need to be made accessible
to other files in the drivers/of directory.
See above.
Post by Grant Likely
Post by Alexander Holler
+int __init of_init_build_order(struct device_node *root,
+ const struct of_device_id *matches)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ root = root ? of_node_get(root) : of_find_node_by_path("/");
+ if (unlikely(!root))
+ return -EINVAL;
+
+ calc_max_phandle();
+ order.old_max_phandle = order.max_phandle;
+
+ for_each_child_of_node(root, child) {
+ rc = add_deps(root, child, matches);
+ if (unlikely(rc))
+ break;
+ }
+
+ of_node_put(root);
+ topological_sort();
Can the sort be performed incrementally? The DT is a dynamic structure
on some platforms. Search for OF_RECONFIG_. There is work in progress to
add overlay support to the device tree so that batches of new nodes can
be added to the tree after userspace has started. The dependency code
will need to handle that situation gracefully.
The stuff I present is only about what happens before userspace starts
and is all gone away when userspace start. I know about the overlay
support (e.g. for bbb-capes), but I don't care. So there is no need to
think about what happens if such happens.
Post by Grant Likely
I don't like that of_init_create_devices() has a completely different
calling convention from of_platform_populate(). of_platform_populate()
is passed a match table for devices that are to act as buses (which
means register the children also). This function is passed a blacklist
instead which is a completely different semantic.
Acting on buses is a workaround.
Post by Grant Likely
That means it cannot be used by device drivers that register their own
children and it has to make a lot of assumptions about what should and
should not be registered as platform_devices.
How does the dependency code decide which devices can be
platform_devices? It's not clear to me from what I've read so far.
Dependencies currently are only considered on stuff which has a
"compatibility" property, thus drivers. I wanted to get the drivers
loaded in order, not really caring for devices. Let me quote from
(outdated) ldd3:

"For the most part, the Linux device model code takes care of all these
considerations without imposing itself upon driver authors. It sits
mostly in the background; direct interaction with the device model is
generally handled by bus-level logic and various other kernel
subsystems. As a result, many driver authors can ignore the device model
entirely, and trust it to take care of itself."

So do I. ;)
Post by Grant Likely
Post by Alexander Holler
+
+ for (i = 0; i < order.count; ++i) {
+ struct device_node *node = order.order[i];
+ uint32_t parent_ph = order.parent_by_phandle[node->phandle];
+
+ if (unlikely(blacklist &&
+ of_match_node(blacklist, node))) {
+ of_node_put(node);
+ continue;
+ }
+ if (unlikely(parent_ph &&
+ !order.device_by_phandle[parent_ph])) {
+ /* init of parent failed */
+ of_node_put(node);
+ continue;
+ }
+ dev = of_dependencies_device_create(node, lookup,
+ order.device_by_phandle[parent_ph]);
+ if (dev)
+ order.device_by_phandle[node->phandle] = &dev->dev;
+ of_node_put(node);
+ }
+ /* remove_new_phandles(); */
+}
I could use some help understanding what is being done here. It looks
like it is going through and only registering devices that have a
dependency parent already created, or don't have a parent at all. Am I
correct?
Yes, that part assumes that if a parent is present, the parent is needed
and it doesn't make sense to create a device if the parent already
failed. That are those two lines I mentioned above.
Post by Grant Likely
It looks like this patch alone will break the kernel because it depends
also on the functionality in patch 5. The patches would need to be
reordered to handle that situation.
I currently don't care if this feature breaks something. Therefor it is
marked in big letters as experimental. But I already see you don't want
it and you see it all as an offense.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 17:20:21 UTC
Permalink
Post by Alexander Holler
On Mon, 12 May 2014 18:47:53 +0200, Alexander Holler
Personally, I think the parts of this patch that manipulate the device registration
order is entirely the wrong way to handle it. If anything, I would say
continue to register the devices, even if the dependencies are unmet.
(...)
Post by Alexander Holler
How does the dependency code decide which devices can be
platform_devices? It's not clear to me from what I've read so far.
Dependencies currently are only considered on stuff which has a
"compatibility" property, thus drivers. I wanted to get the drivers
loaded in order, not really caring for devices. Let me quote from
"For the most part, the Linux device model code takes care of all these
considerations without imposing itself upon driver authors. It sits
mostly in the background; direct interaction with the device model is
generally handled by bus-level logic and various other kernel
subsystems. As a result, many driver authors can ignore the device model
entirely, and trust it to take care of itself."
So do I. ;)
To explain a bit further, I've started with totally ignoring the device
model just careing for the order in why drivers will be initialized.

Than the device model did come into my way. ;)

But it isn't any problem at all to extend the stuff to care for devices.
That even would reduce some code in dtc with the disadvantage that the
sizes of blobs will slightly increase a bit more, because they then
would include dependencies to devices too (instead of just dependencies
between drivers).

So I'm absolutely open here. If using dependencies between devices is
necessary or has advantages, that could be changed with changing a few
lines.

Regards,

Alexander Holler
Grant Likely
2014-05-14 20:06:45 UTC
Permalink
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Use the properties named 'dependencies' in binary device tree blobs to build
a dependency based initialization order for platform devices and drivers.
This is done by building a directed acyclic graph using an adjacency list
and doing a topological sort to retrieve the order in which devices/drivers
should be created/initialized.
Hi Alexander,
Thanks for looking at this. It is a difficult problem. I've made
comments below, but first I've got some general comments...
First, I'm going to be very cautious about this. It is a complicated
piece of code making the device initialization process a lot more
complicated than it already is. I'm the first one to admit that deferred
probe handles the problem in quite a naive manner, but it is simple,
correct (when drivers support deferred probe) and easy to audit. This
series digs deep into the registration order of *both* devices and
drivers which gives me the heebee jeebees.
Sure, but the approach I present is deterministic. The deferred stuff,
while it's simple and works, is imho just a workaround. Besides that
this series isn't about pro or cons of the deferred probe, the deferred
probes I have seen where just the reason why I had a look at what
happens. To conclude, I like the deferred probe because it fixes
problems, but trying to do things right is much better. And there are
already many workarounds trying fix the initialization order (e.g.
drivers which classify themself as a subsys), so starting do it right
makes imho sense.
So, I'm sorry if you see this feature as an attack on the deferred probe
stuff, it isn't meant as such, no offense here or there.
Hahaha, calm down. I'm not upset nor do I see it as an attack on
deferred probe.... Okay, maybe I do, but I've been asking people to
attack deferred probe for years! It works, but it is by no means optimized.
Post by Alexander Holler
Post by Grant Likely
Personally, I think the parts of this patch that manipulate the device registration
order is entirely the wrong way to handle it. If anything, I would say
continue to register the devices, even if the dependencies are unmet.
That would just be a removal of 2 lines. I've no problem with that. ;)
Post by Grant Likely
Instead, I would focus on the driver core (drivers/base) to catch
device probe attempts when a known dependency is not met, remember it,
and perform the probe after the other driver shows up. That is also
going to be a complicated bit of code, but it works for every kind of
device, not just platform_devices, and has far less impact on the
platform setup code.
BTW, this has to be able to work at the level of struct device instead
of struct platform_device. There are far more kinds of devices than just
platform_device, and they all have the same problem.
Changing to care for devices instead of just drivers is easy to do.
Post by Grant Likely
Also, may I suggest that the more pieces that you can break this series
up into, the greater chance you'll have of getting a smaller subset
merged earlier if it can be proven to be useful on its own.
Hmm, I don't really care if that will be merged. I have no motivation to
fight with Linux kernel maintainers and I don't know if I will spend the
necessary time to do so.
The people you need to convince are Rob, Greg, and me, and you've got my
attention. If I'm convinced, then I can probably convince Greg also.
You've got an interesting approach, and I hope you won't give up on it.
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (!of_init_build_order(NULL, NULL))
+ of_init_create_devices(NULL, NULL);
+ else
+ of_init_free_order();
What happens when of_init_build_order() fails? Does the whole system
fall over?
Yes. The only reason it can fail is when there is a cycle, and dtc
checks (and fails) for that when building the blob (dtb).
Post by Grant Likely
Post by Alexander Holler
+#else
of_platform_populate(NULL, of_default_bus_match_table,
NULL, NULL);
#endif
+ }
+#endif
+
return 0;
}
arch_initcall(customize_machine);
@@ -914,7 +924,13 @@ void __init setup_arch(char **cmdline_p)
arm_pm_restart = mdesc->restart;
unflatten_device_tree();
-
+#ifdef CONFIG_OF_DEPENDENCIES
+ /*
+ * No alloc used in of_init_build_order(), therefor it would work
+ * already here too.
+ */
+ /* of_init_build_order(NULL, NULL); */
+#endif
Stale hunk left in patch?
See here: https://lkml.org/lkml/2014/5/14/102
This are NOT patches meant for final merging!
Understood. No malice is intended. That's just the normal stuff that
comes up in review.
Post by Alexander Holler
Post by Grant Likely
I would suggest splitting the core graph support into a separate patch
to keep things smaller and to keep the behaviour changes separate from
the support function additions.
Could be done.
Post by Grant Likely
Post by Alexander Holler
+
+
+/* Copied from drivers/of/base.c (because it's lockless). */
Copying isn't a good idea. The function will need to be made accessible
to other files in the drivers/of directory.
See above.
Post by Grant Likely
Post by Alexander Holler
+int __init of_init_build_order(struct device_node *root,
+ const struct of_device_id *matches)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ root = root ? of_node_get(root) : of_find_node_by_path("/");
+ if (unlikely(!root))
+ return -EINVAL;
+
+ calc_max_phandle();
+ order.old_max_phandle = order.max_phandle;
+
+ for_each_child_of_node(root, child) {
+ rc = add_deps(root, child, matches);
+ if (unlikely(rc))
+ break;
+ }
+
+ of_node_put(root);
+ topological_sort();
Can the sort be performed incrementally? The DT is a dynamic structure
on some platforms. Search for OF_RECONFIG_. There is work in progress to
add overlay support to the device tree so that batches of new nodes can
be added to the tree after userspace has started. The dependency code
will need to handle that situation gracefully.
The stuff I present is only about what happens before userspace starts
and is all gone away when userspace start. I know about the overlay
support (e.g. for bbb-capes), but I don't care. So there is no need to
think about what happens if such happens.
ok. As long as there is no impact, then I don't have an issue.
Post by Alexander Holler
Post by Grant Likely
I don't like that of_init_create_devices() has a completely different
calling convention from of_platform_populate(). of_platform_populate()
is passed a match table for devices that are to act as buses (which
means register the children also). This function is passed a blacklist
instead which is a completely different semantic.
Acting on buses is a workaround.
Can you elaborate on what you mean? The of_platform_populate() semantics
are quite specific because they are the filter for which devices are
going to be platform_devices.
Post by Alexander Holler
Post by Grant Likely
That means it cannot be used by device drivers that register their own
children and it has to make a lot of assumptions about what should and
should not be registered as platform_devices.
How does the dependency code decide which devices can be
platform_devices? It's not clear to me from what I've read so far.
Dependencies currently are only considered on stuff which has a
"compatibility" property, thus drivers. I wanted to get the drivers
loaded in order, not really caring for devices. Let me quote from
"For the most part, the Linux device model code takes care of all these
considerations without imposing itself upon driver authors. It sits
mostly in the background; direct interaction with the device model is
generally handled by bus-level logic and various other kernel
subsystems. As a result, many driver authors can ignore the device model
entirely, and trust it to take care of itself."
So do I. ;)
of_platform_populate() currently makes the decision for
platform_devices. Other busses do the same for their own child devices.
In this case after calculating the dependency information we would want
to make it available to all bus types so that they to can take advantage
of dependency information.
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
+
+ for (i = 0; i < order.count; ++i) {
+ struct device_node *node = order.order[i];
+ uint32_t parent_ph = order.parent_by_phandle[node->phandle];
+
+ if (unlikely(blacklist &&
+ of_match_node(blacklist, node))) {
+ of_node_put(node);
+ continue;
+ }
+ if (unlikely(parent_ph &&
+ !order.device_by_phandle[parent_ph])) {
+ /* init of parent failed */
+ of_node_put(node);
+ continue;
+ }
+ dev = of_dependencies_device_create(node, lookup,
+ order.device_by_phandle[parent_ph]);
+ if (dev)
+ order.device_by_phandle[node->phandle] = &dev->dev;
+ of_node_put(node);
+ }
+ /* remove_new_phandles(); */
+}
I could use some help understanding what is being done here. It looks
like it is going through and only registering devices that have a
dependency parent already created, or don't have a parent at all. Am I
correct?
Yes, that part assumes that if a parent is present, the parent is needed
and it doesn't make sense to create a device if the parent already
failed. That are those two lines I mentioned above.
Ah, so it is just the normal node's parent, not a dependency link. okay.
Post by Alexander Holler
Post by Grant Likely
It looks like this patch alone will break the kernel because it depends
also on the functionality in patch 5. The patches would need to be
reordered to handle that situation.
I currently don't care if this feature breaks something. Therefor it is
marked in big letters as experimental. But I already see you don't want
it and you see it all as an offense.
I'm sorry you feel that way. I actually have quite the opposite opinion.
I'm asking the questions and pushing back to a) make sure I understand
what you're doing, and b) figure out wether it can be brought into a
state where it can be merged.

g.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 21:10:39 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
Hmm, I don't really care if that will be merged. I have no motivation to
fight with Linux kernel maintainers and I don't know if I will spend the
necessary time to do so.
The people you need to convince are Rob, Greg, and me, and you've got my
attention. If I'm convinced, then I can probably convince Greg also.
You've got an interesting approach, and I hope you won't give up on it.
Sorry, but that point of the view is already a problem. Why do I have to
convince you people?

To summarize:

Linux kernel maintainers

- don't like code they haven't written theirself,
- don't like code which doesn't look like they have written it theirself,
- don't like ideas they haven't had themself,
- do feel good in their closed group they have formed,
...

You see, I long have given up. The reason I still posted these patches
is just because I don't care anymore about Linux kernel maintainers at
all. I was prepared that I do that just for the fun of annoying some
Linux kernel maintainers. I've already decided some time ago that I just
will post my patches once to the LKML (so that other poor Linux kernel
users may find them) and will ignore Linux kernel maintainers.

Sorry, but I have a long and very unpleasant history in dealing with
Linux kernel maintainers, and they already have called me almost
anything like "ugly code writer", "frustrated" and such things more.

Fortunately I'm too old to care about such, that's why I still post
patches (besides that I like to solve problems and help other people). ;)

Regards,

Alexander Holler
Grant Likely
2014-05-16 11:00:08 UTC
Permalink
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Hmm, I don't really care if that will be merged. I have no motivation to
fight with Linux kernel maintainers and I don't know if I will spend the
necessary time to do so.
The people you need to convince are Rob, Greg, and me, and you've got my
attention. If I'm convinced, then I can probably convince Greg also.
You've got an interesting approach, and I hope you won't give up on it.
Sorry, but that point of the view is already a problem. Why do I have to
convince you people?
Linux kernel maintainers
- don't like code they haven't written theirself,
- don't like code which doesn't look like they have written it theirself,
- don't like ideas they haven't had themself,
- do feel good in their closed group they have formed,
I'm sorry that you feel that way and that you don't want to continue
with this. Best wishes.

g.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-18 09:53:45 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Hmm, I don't really care if that will be merged. I have no motivation to
fight with Linux kernel maintainers and I don't know if I will spend the
necessary time to do so.
The people you need to convince are Rob, Greg, and me, and you've got my
attention. If I'm convinced, then I can probably convince Greg also.
You've got an interesting approach, and I hope you won't give up on it.
Sorry, but that point of the view is already a problem. Why do I have to
convince you people?
Linux kernel maintainers
- don't like code they haven't written theirself,
- don't like code which doesn't look like they have written it theirself,
- don't like ideas they haven't had themself,
- do feel good in their closed group they have formed,
I'm sorry that you feel that way and that you don't want to continue
with this. Best wishes.
Sorry to hit you with reality, but it just will not happen that I will
try to convience any Linux kernel maintainer (anymore). It is their job
to decide what they want and not mine to convince them. I offer code and
arguments, but I will not offer my pride, ego or how you want to name
it. If Linux kernel maintainers are unable to deal with the power
they've got, that isn't my problem.

Regards,

Alexander Holler
Alexander Shiyan
2014-05-16 17:31:20 UTC
Permalink
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Hmm, I don't really care if that will be merged. I have no motivation to
fight with Linux kernel maintainers and I don't know if I will spend the
necessary time to do so.
The people you need to convince are Rob, Greg, and me, and you've got my
attention. If I'm convinced, then I can probably convince Greg also.
You've got an interesting approach, and I hope you won't give up on it.
Sorry, but that point of the view is already a problem. Why do I have to
convince you people?
Linux kernel maintainers
- don't like code they haven't written theirself,
- don't like code which doesn't look like they have written it theirself,
- don't like ideas they haven't had themself,
- do feel good in their closed group they have formed,
...
Very correct and well said.
For many subsystems is it as is, we cannot break these barriers.

---
Alexander Holler
2014-05-14 15:51:36 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
+
+ if (graph.finished)
+ return -EINVAL; /* cycle found */
+
+ /* of_init_print_order(); */
If you wrap of_init_print_order with a #ifdef DEBUG/#else/#endif, then
you don't need to comment out the call to of_init_print_order().
To explain why I didn't use DEBUG here: DEBUG enables a lot of
distracting messages. which is true for CONFIG_DEBUG_DRIVER too.
Therefor I often prefer to use just pr_info("AHO: ..."); with which I
print only stuff I want (and can easily grep for).

And as said, the patches I presented are meant for evaluation. I did
patches without discussing them before in order to avoid endless
discussion which likely never would have found an end and therfor would
have prevented a start. So now you already have some to play and test
with, without anyone had to discuss stuff before. ;)

Regards,

Alexander Holler
Tomasz Figa
2014-05-17 14:24:19 UTC
Permalink
Hi,
Post by Grant Likely
Post by Alexander Holler
Use the properties named 'dependencies' in binary device tree blobs to build
a dependency based initialization order for platform devices and drivers.
This is done by building a directed acyclic graph using an adjacency list
and doing a topological sort to retrieve the order in which devices/drivers
should be created/initialized.
Hi Alexander,
Thanks for looking at this. It is a difficult problem. I've made
comments below, but first I've got some general comments...
First, I'm going to be very cautious about this. It is a complicated
piece of code making the device initialization process a lot more
complicated than it already is. I'm the first one to admit that deferred
probe handles the problem in quite a naive manner, but it is simple,
correct (when drivers support deferred probe) and easy to audit. This
series digs deep into the registration order of *both* devices and
drivers which gives me the heebee jeebees.
Personally, I think the parts of this patch that manipulate the device registration
order is entirely the wrong way to handle it. If anything, I would say
continue to register the devices, even if the dependencies are unmet.
Instead, I would focus on the driver core (drivers/base) to catch
device probe attempts when a known dependency is not met, remember it,
and perform the probe after the other driver shows up. That is also
going to be a complicated bit of code, but it works for every kind of
device, not just platform_devices, and has far less impact on the
platform setup code.
Grant, I tend to disagree with you on this. While Alexander's solution
has certain flaws (that I'm going to list further in my reply), I also
believe that an approach based on device registration order is most
likely the way to go. As compared to other possible approaches, here is
the list of advantages I can see (in random order):

1) If compared with resource-based approach, when you detect
dependencies at runtime, based on existing resource specifiers (GPIOs,
clocks, etc.), you don't need to change anything in the implementation
whenever a new kind of resources is introduced. Moreover, there is no
need to make device probing code aware of any resources or dependencies,
because all you need to do is to register devices in certain order, as
defined by precompiled dependency lists.

2) If implemented properly, it helps solving problem of binding proper
driver to a device with multiple compatible strings. Current way of
handling this by Linux is _broken_, because the device will be bound to
first driver that shows up and contains matching compatible string in
its match table. Notice that this way the whole idea of having multiple
compatible strings specified from most specific to most generic is made
useless. Now you may wonder how both problems relate. Basically in both
cases you need to wait until drivers for devices are available (e.g.
registered in system-wide list), either to guarantee that registering a
device means probing it (in first case) or to look through the list of
available drivers and select the one that is a best match (in second case).

3) DeviceTree is not the only firmware "interface" supported by Linux
and so I'd be careful with pushing this into generic driver core that is
also shared with board files and ACPI and possibly something else I'm
not aware of. Moreover, I think the existing driver core is already
quite complex and complicating it even more might not be the best idea,
unless really the only option.

4) This approach is far less complicated than anything mentioned above.
What's so complicated in creating a graph of devices and registering
them in certain order?
Post by Grant Likely
BTW, this has to be able to work at the level of struct device instead
of struct platform_device. There are far more kinds of devices than just
platform_device, and they all have the same problem.
Agreed.
Post by Grant Likely
Also, may I suggest that the more pieces that you can break this series
up into, the greater chance you'll have of getting a smaller subset
merged earlier if it can be proven to be useful on its own.
Agreed. It is usually a good idea to separate things that could live on
their own and be useful.

Now, I'll spare myself from judging the code, as until we get an
accepted design, I don't think there is any point in discussing about
implementation details, not even mentioning things like coding style
(which is important, but not when much of the code might still be
rewritten completely).

OK, so I mentioned above what I like in this kind of approach. Now let's
move to what I don't like.

I think the part that alters driver registration and initcalls isn't
really necessary. With current code, we can see that initcalls
themselves (not driver code in terms of driver model) are already well
ordered, as things happening there seems to work, without any need to
defer anything. Now Alexander's approach relies on
module_platform_driver() and similar macros to obtain the list of
platform_drivers in the system, but I think this isn't necessary either.

Now this is just a bit of guessing, as I still haven't been able to
allocate time to take a deeper look into initcall and driver code, but
what if we let the initcalls do their work, let's say up to
late_initcall level and only then register drivers in our precomputed
order? We seem to be already relying an assumption that on late_initcall
level the devices should be already probed, as we have various calls
disabling unused resources, such as regulators and clocks, at this
level. I can see certain drivers being registered in late_initcalls, but
this would be after our device registration, so most of dependencies
should be already there and if not, we still have deferred probing.

With such design in place, we would be able to also solve the other
problem I mentioned above, the problem of matching devices with most
appropriate drivers. Since at device registration and probing time,
(almost) all the drivers would be already available, the matching code
could select the right one to bind.

Alright, that's my take on this. Looking forward to your comments.

Best regards,
Tomasz
Grant Likely
2014-05-18 14:59:22 UTC
Permalink
Hi Tomasz,

Thanks for weighing in on this. Thoughts and comments below.
Post by Tomasz Figa
Hi,
Post by Grant Likely
Post by Alexander Holler
Use the properties named 'dependencies' in binary device tree blobs to build
a dependency based initialization order for platform devices and drivers.
This is done by building a directed acyclic graph using an adjacency list
and doing a topological sort to retrieve the order in which devices/drivers
should be created/initialized.
Hi Alexander,
Thanks for looking at this. It is a difficult problem. I've made
comments below, but first I've got some general comments...
First, I'm going to be very cautious about this. It is a complicated
piece of code making the device initialization process a lot more
complicated than it already is. I'm the first one to admit that deferred
probe handles the problem in quite a naive manner, but it is simple,
correct (when drivers support deferred probe) and easy to audit. This
series digs deep into the registration order of *both* devices and
drivers which gives me the heebee jeebees.
Personally, I think the parts of this patch that manipulate the device registration
order is entirely the wrong way to handle it. If anything, I would say
continue to register the devices, even if the dependencies are unmet.
Instead, I would focus on the driver core (drivers/base) to catch
device probe attempts when a known dependency is not met, remember it,
and perform the probe after the other driver shows up. That is also
going to be a complicated bit of code, but it works for every kind of
device, not just platform_devices, and has far less impact on the
platform setup code.
Grant, I tend to disagree with you on this. While Alexander's solution
has certain flaws (that I'm going to list further in my reply), I also
believe that an approach based on device registration order is most
likely the way to go. As compared to other possible approaches, here is
1) If compared with resource-based approach, when you detect
dependencies at runtime, based on existing resource specifiers (GPIOs,
clocks, etc.), you don't need to change anything in the implementation
whenever a new kind of resources is introduced. Moreover, there is no
need to make device probing code aware of any resources or dependencies,
because all you need to do is to register devices in certain order, as
defined by precompiled dependency lists.
I think we can handle the source of dependencies separately from how
those depenencies are used. I would rather not have a separate set of
properties for dependency tracking because of the possiblity of it
being made inconsistent, but I'm not flat out refusing.
Post by Tomasz Figa
2) If implemented properly, it helps solving problem of binding proper
driver to a device with multiple compatible strings. Current way of
handling this by Linux is _broken_, because the device will be bound to
first driver that shows up and contains matching compatible string in
its match table. Notice that this way the whole idea of having multiple
compatible strings specified from most specific to most generic is made
useless. Now you may wonder how both problems relate. Basically in both
cases you need to wait until drivers for devices are available (e.g.
registered in system-wide list), either to guarantee that registering a
device means probing it (in first case) or to look through the list of
available drivers and select the one that is a best match (in second case).
No argument here.
Post by Tomasz Figa
3) DeviceTree is not the only firmware "interface" supported by Linux
and so I'd be careful with pushing this into generic driver core that is
also shared with board files and ACPI and possibly something else I'm
not aware of.
Competely agree. Anything that goes in to driver core cannot be OF
specific. If the driver core is modified, then it needs to be
generically useful regardless of firmware interface.
Post by Tomasz Figa
Moreover, I think the existing driver core is already
quite complex and complicating it even more might not be the best idea,
unless really the only option.
I think it actually is the only option. More below.
Post by Tomasz Figa
4) This approach is far less complicated than anything mentioned above.
What's so complicated in creating a graph of devices and registering
them in certain order?
As Alexander's patch series shows, merely manipulating the registration
order of devices doesn't actually solve anything. For most platform
devices, link order has a far large impact on probe order than
registration order. Alexander had to hook into the driver registration
patch to get the optimal probe order. For device order to be effective,
all driver registration would need to occur before devices are
registered (opposite of what we do now).

The driver core is very simple in this regard. It accepts device and
driver registration. When one gets registered, it immediately attempts
to match against one of the others. Simple and easy to understand, but
very non-deterministic behaviour.

A lot of devices gets registered well before the driver, platform_devices
especially. Other devices may very well show up afterwards, such as
anything registered by another driver. For example, it is the
responsibility of an i2c bus driver to register all the child i2c
devices. By the time the i2c bus driver gets probed, the i2c drivers may
already be available. And to make things worse, any device could depend
on any other regardless of bus type.

To actually solve the problem we need to deal with dependencies across
all devices, regardless of bus type and regardless of whether or not the
driver gets registered first. Tackling only device order, or only driver
order misses a whole bunch of situations.

Alexander's patch is the right idea here. It collects a bunch of driver
registrations for an initcall without calling probe so that a sorting pass
can be performed first. That approach can solve both the dependency
order and the compatible list problems, and I think it is worth
exploring.

Having said that, there are some things that I worry about. I worry
about the cost of doing dependency sorting, both in calculating the
dependency tree and in pushing back probe calls to the end of initcalls.
I worry that incorrect dependency information will cause some devices to
not get bound (say because the kernel things the dependency isn't met
when it actually is).

Again, that doesn't mean I'm saying "don't do this, it is bad". It just
means those are the corner cases and performance issues that I want to
make sure are handled well. They are the questions I'll be asking before
I merge anything. I'd be thrilled for someone to continue this work.
Post by Tomasz Figa
Post by Grant Likely
Also, may I suggest that the more pieces that you can break this series
up into, the greater chance you'll have of getting a smaller subset
merged earlier if it can be proven to be useful on its own.
I think the part that alters driver registration and initcalls isn't
really necessary. With current code, we can see that initcalls
themselves (not driver code in terms of driver model) are already well
ordered, as things happening there seems to work, without any need to
defer anything. Now Alexander's approach relies on
module_platform_driver() and similar macros to obtain the list of
platform_drivers in the system, but I think this isn't necessary either.
Hahaha, as described above, this is where I think Alexander is on the
right path!!!
Post by Tomasz Figa
Now this is just a bit of guessing, as I still haven't been able to
allocate time to take a deeper look into initcall and driver code, but
what if we let the initcalls do their work, let's say up to
late_initcall level and only then register drivers in our precomputed
order?
If we can compute an ideal driver registration order, then this will
always be a helpful thing to do. It doesn't catch everything, but it can
make a lot of things better.

Cheers,
g.
Post by Tomasz Figa
We seem to be already relying an assumption that on late_initcall
level the devices should be already probed, as we have various calls
disabling unused resources, such as regulators and clocks, at this
level. I can see certain drivers being registered in late_initcalls, but
this would be after our device registration, so most of dependencies
should be already there and if not, we still have deferred probing.
With such design in place, we would be able to also solve the other
problem I mentioned above, the problem of matching devices with most
appropriate drivers. Since at device registration and probing time,
(almost) all the drivers would be already available, the matching code
could select the right one to bind.
Alright, that's my take on this. Looking forward to your comments.
Best regards,
Tomasz
Alexander Holler
2014-05-19 08:41:11 UTC
Permalink
Post by Grant Likely
Hi Tomasz,
Thanks for weighing in on this. Thoughts and comments below.
registration order. Alexander had to hook into the driver registration
patch to get the optimal probe order. For device order to be effective,
I had to hook into the driver registration to get information about the
(available) drivers. Without the hook it is currently impossible to
identify drivers before they start doing things.

To reccover, I had to solve several problems:

- Getting dependencies (happens almost automatically by using phandle
references)
- Get them to the kernel (done by using a new property)
- Build order (already a solved problem, think at make)
- Identify available drivers (invented hook, "well done" is meant in
regard to this feature, I needed a name and found "well done" apropriate
because it too might stimulate driver authors to use it)
- Check out how to handle/start/register devices and drivers (in order).

I think the last one is the most unfinished and questionable part.

The part to identify drivers could be done much better by linking an
array of struct platform_driver, but in order to use such an array,
drivers have to be done "well done" too (which means no action before
probe). So that well-done hook can be seen as an intermediate step.
Post by Grant Likely
Having said that, there are some things that I worry about. I worry
about the cost of doing dependency sorting, both in calculating the
dependency tree and in pushing back probe calls to the end of initcalls.
Building and calculating the dependency tree just needs a few ms and I
think it's much faster than what is necessary afterwards, all those
string compares to match drivers/devices.

But this string compares already do happen, and I think this part could
optimized a lot, when a list of drivers and their compatibility strings
is available. Then it's possible to build a hash or e.g. radix tree
which leads from the compatibility string to the available driver(s).
Post by Grant Likely
I worry that incorrect dependency information will cause some devices to
not get bound (say because the kernel things the dependency isn't met
when it actually is).
All (not started) drivers and (unbounded) devices can still be
registered/bound after those which appear in the order. That would be
just like before.

But as said, the whole handling which happens after the order was build
is done quick & dirty, done with missing knownledge about the device
model, and might contain a lot of bugs and even might need that some
drivers will be changed.

Therefor all changes disappear when CONFIG_OF_DEPENDENCIES is disabled.
So tested platforms might use it (taking advantage of a deterministic
order in order to get rid of hardcoded stuff to fix the order) and
others don't have to care.

So, as already said, I've posted these patches to make evaluation easy,
without the need to discuss just ideas but something real to play with
(in order to get something happen on this front, not just hardcoded
hacks done in individual drivers because such passes maintainers easier).

I didn't cared much about form or how to split those patches into more
convenient parts. That is stuff where I just do it like a maintainer
does want it. I did them as I like them, and I don't want to end up in a
time wasting discussions about form, style or similiar questions.

So if anyone would be comfortable to merge these patches (for evaluation
by others) in other form or splitted in more parts, I will just hear and do.

I also don't have any objections in changes in the stuff which happens
after the order was build. In fact I would even like it if someone with
more experience with the driver model would do it. I just had to do
something there too, otherwise it would still have been just an idea
which wouldn't offer much motivation to actually look at it.

Regards,

Alexander Holler
Alexander Holler
2014-05-12 16:47:54 UTC
Permalink
Add option -t to print the default initialization order.
No other output will be generated.

To print the order, just use something like this:

CROSS_COMPILE=gcc-foo ARCH=arm make foo.dtb
scripts/dtc/dtc -I dtb -t arch/arm/boot/dts/foo.dtb

Since it's now possible to check to for cycles in the dependency graph,
this is now done too.

Signed-off-by: Alexander Holler <holler-SXC+***@public.gmane.org>
---
scripts/dtc/dependencies.c | 346 +++++++++++++++++++++++++++++++++++++++++++++
scripts/dtc/dtc.c | 24 +++-
scripts/dtc/dtc.h | 2 +
3 files changed, 371 insertions(+), 1 deletion(-)

diff --git a/scripts/dtc/dependencies.c b/scripts/dtc/dependencies.c
index dd4658c..8fe1a8c 100644
--- a/scripts/dtc/dependencies.c
+++ b/scripts/dtc/dependencies.c
@@ -106,3 +106,349 @@ void add_dependencies(struct boot_info *bi)
{
process_nodes_props(bi->dt, bi->dt);
}
+
+/*
+ * The code below is in large parts a copy of drivers/of/of_dependencies.c
+ * in the Linux kernel. So both files do share the same bugs.
+ * The next few ugly defines do exist to keep the differences at a minimum.
+ */
+static struct node *tree;
+#define pr_cont(format, ...) printf(format, ##__VA_ARGS__)
+#define pr_info(format, ...) printf(format, ##__VA_ARGS__)
+#define pr_warn(format, ...) printf(format, ##__VA_ARGS__)
+#define pr_err(format, ...) fprintf(stderr, format, ##__VA_ARGS__)
+typedef cell_t __be32;
+#define device_node node
+#define full_name fullpath
+#define __initdata
+#define __init
+#define unlikely(a) (a)
+#define of_node_put(a)
+#define of_find_node_by_phandle(v) get_node_by_phandle(tree, v)
+#define __of_get_property(a, b, c) get_property(a, b)
+#define for_each_child_of_node(a, b) for_each_child(a, b)
+
+
+#define MAX_DT_NODES 1000 /* maximum number of vertices */
+#define MAX_EDGES (MAX_DT_NODES*2) /* maximum number of edges (dependencies) */
+
+struct edgenode {
+ uint32_t y; /* phandle */
+ struct edgenode *next; /* next edge in list */
+};
+
+/*
+ * Vertex numbers do correspond to phandle numbers. That means the graph
+ * does contain as much vertices as the maximum of all phandles.
+ * Or in other words, we assume that for all phandles in the device tree
+ * 0 < phandle < MAX_DT_NODES+1 is true.
+ */
+struct dep_graph {
+ struct edgenode edge_slots[MAX_EDGES]; /* used to avoid kmalloc */
+ struct edgenode *edges[MAX_DT_NODES+1]; /* adjacency info */
+ unsigned nvertices; /* number of vertices in graph */
+ unsigned nedges; /* number of edges in graph */
+ bool processed[MAX_DT_NODES+1]; /* which vertices have been processed */
+ bool include_node[MAX_DT_NODES+1]; /* which nodes to consider */
+ bool discovered[MAX_DT_NODES+1]; /* which vertices have been found */
+ bool finished; /* if true, cut off search immediately */
+};
+static struct dep_graph graph __initdata;
+
+struct init_order {
+ uint32_t max_phandle; /* the max used phandle */
+ uint32_t old_max_phandle; /* used to keep track of added phandles */
+ struct device_node *order[MAX_DT_NODES+1];
+ unsigned count;
+ /* Used to keep track of parent devices in regard to the DT */
+ uint32_t parent_by_phandle[MAX_DT_NODES+1];
+ struct device *device_by_phandle[MAX_DT_NODES+1];
+};
+static struct init_order order __initdata;
+
+
+/* Copied from drivers/of/base.c (because it's lockless). */
+static int __init __of_device_is_available(struct device_node *device)
+{
+ struct property *status;
+
+ if (!device)
+ return 0;
+
+ status = get_property(device, "status");
+ if (status == NULL)
+ return 1;
+
+ if (status->val.len > 0) {
+ if (!strcmp(status->val.val, "okay") ||
+ !strcmp(status->val.val, "ok"))
+ return 1;
+ }
+
+ return 0;
+}
+
+/*
+ * x is a dependant of y or in other words
+ * y will be initialized before x.
+ */
+static int __init insert_edge(uint32_t x, uint32_t y)
+{
+ struct edgenode *p; /* temporary pointer */
+
+ if (unlikely(x > MAX_DT_NODES || y > MAX_DT_NODES)) {
+ pr_err("Node found with phandle 0x%x > MAX_DT_NODES (%d)!\n",
+ x > MAX_DT_NODES ? x : y, MAX_DT_NODES);
+ return -EINVAL;
+ }
+ if (unlikely(!x || !y))
+ return 0;
+ if (unlikely(graph.nedges >= MAX_EDGES)) {
+ pr_err("Maximum number of edges (%d) reached!\n", MAX_EDGES);
+ return -EINVAL;
+ }
+ p = &graph.edge_slots[graph.nedges++];
+ graph.include_node[x] = 1;
+ graph.include_node[y] = 1;
+ p->y = y;
+ p->next = graph.edges[x];
+ graph.edges[x] = p; /* insert at head of list */
+
+ graph.nvertices = (x > graph.nvertices) ? x : graph.nvertices;
+ graph.nvertices = (y > graph.nvertices) ? y : graph.nvertices;
+ return 0;
+}
+
+static void __init print_node_name(uint32_t v)
+{
+ struct device_node *node;
+
+ node = of_find_node_by_phandle(v);
+ if (!node) {
+ pr_err("Node for phandle 0x%x not found", v);
+ return;
+ }
+ if (node->name)
+ pr_err("%s", node->name);
+ if (node->full_name)
+ pr_err(" (%s)", node->full_name);
+ of_node_put(node);
+}
+
+/*
+ * I would prefer to use the BGL (Boost Graph Library), but as I can't use it
+ * here (for obvious reasons), the next four functions below are based on
+ * code of Steven Skiena's book 'The Algorithm Design Manual'.
+ */
+
+static void __init process_edge(uint32_t x, uint32_t y)
+{
+ if (unlikely(graph.discovered[y] && !graph.processed[y])) {
+ pr_err("Cycle found 0x%x ", x);
+ print_node_name(x);
+ pr_cont(" <-> 0x%x ", y);
+ print_node_name(y);
+ pr_cont("!\n");
+ graph.finished = 1;
+ }
+}
+
+static void __init process_vertex_late(uint32_t v)
+{
+ struct device_node *node;
+
+ node = of_find_node_by_phandle(v);
+ if (!node) {
+ pr_err("No node for phandle 0x%x not found", v);
+ return;
+ }
+ order.order[order.count++] = node;
+}
+
+static void __init depth_first_search(uint32_t v)
+{
+ struct edgenode *p;
+ uint32_t y; /* successor vertex */
+
+ if (graph.finished)
+ return;
+ graph.discovered[v] = 1;
+ p = graph.edges[v];
+ while (p) {
+ y = p->y;
+ if (!graph.discovered[y]) {
+ process_edge(v, y);
+ depth_first_search(y);
+ } else
+ process_edge(v, y);
+ if (graph.finished)
+ return;
+ p = p->next;
+ }
+ process_vertex_late(v);
+ graph.processed[v] = 1;
+}
+
+static void __init topological_sort(void)
+{
+ unsigned i;
+
+ for (i = 1; i <= graph.nvertices; ++i)
+ if (!graph.discovered[i] && graph.include_node[i])
+ depth_first_search(i);
+}
+
+static int __init add_dep_list(struct device_node *node)
+{
+ const __be32 *list, *list_end;
+ uint32_t ph;
+ struct property *prop;
+ int rc = 0;
+ struct device_node *dep;
+
+ prop = get_property(node, "dependencies");
+ if (!prop || !prop->val.len || prop->val.len%sizeof(*list))
+ return 0;
+ list = (const __be32 *)prop->val.val;
+ list_end = list + prop->val.len / sizeof(*list);
+ while (list < list_end) {
+ ph = fdt32_to_cpu(*list++);
+ if (unlikely(!ph)) {
+ /* Should never happen */
+ if (node->name)
+ pr_warn("phandle == 0 for %s\n", node->name);
+ continue;
+ }
+ dep = of_find_node_by_phandle(ph);
+ if (unlikely(!dep)) {
+ pr_err("No DT node for dependency with phandle 0x%x found\n",
+ ph);
+ continue;
+ }
+ rc = insert_edge(node->phandle, ph);
+ if (rc)
+ break;
+ }
+
+ return rc;
+}
+
+/* Copied from drivers/of/base.c */
+static const char *of_prop_next_string(struct property *prop, const char *cur)
+{
+ const char *curv = cur;
+
+ if (!prop)
+ return NULL;
+
+ if (!cur)
+ return prop->val.val;
+
+ curv += strlen(cur) + 1;
+ if (curv >= prop->val.val + prop->val.len)
+ return NULL;
+
+ return curv;
+}
+
+static int __init add_deps_lnx(struct device_node *parent,
+ struct device_node *node)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ if (!__of_device_is_available(node))
+ return 0;
+ if (__of_get_property(node, "compatible", NULL)) {
+ if (!parent->phandle) {
+ if (__of_get_property(parent, "compatible", NULL))
+ parent->phandle = 1 + order.max_phandle++;
+ }
+ if (!node->phandle)
+ node->phandle = 1 + order.max_phandle++;
+ rc = insert_edge(node->phandle, parent->phandle);
+ if (rc)
+ return rc;
+ if (unlikely(order.parent_by_phandle[node->phandle])) {
+ /* sanity check */
+ pr_err("0x%x already has a parent!\n", node->phandle);
+ return -EINVAL;
+ }
+ order.parent_by_phandle[node->phandle] = parent->phandle;
+ rc = add_dep_list(node);
+ if (unlikely(rc))
+ return rc;
+ parent = node; /* change the parent only if node is a driver */
+ }
+ for_each_child_of_node(node, child) {
+ rc = add_deps_lnx(parent, child);
+ if (unlikely(rc))
+ break;
+ }
+
+ return rc;
+}
+
+static void calc_max_phandle(struct node *np)
+{
+ struct node *child;
+
+ if (!np || np->deleted)
+ return;
+ if (np->phandle > order.max_phandle)
+ order.max_phandle = np->phandle;
+
+ for_each_child(np, child)
+ calc_max_phandle(child);
+
+ return;
+}
+
+void __init of_init_print_order(const char *name)
+{
+ unsigned i;
+ struct property *prop;
+ const char *cp;
+
+ pr_info("Default initialization order for %s:\n", name);
+ for (i = 0; i < order.count; ++i) {
+ pr_info("init %u 0x%x", i, order.order[i]->phandle);
+ if (order.order[i]->name)
+ pr_cont(" %s", order.order[i]->name);
+ if (order.order[i]->full_name)
+ pr_cont(" (%s)", order.order[i]->full_name);
+ prop = get_property(order.order[i], "compatible");
+ for (cp = of_prop_next_string(prop, NULL); cp;
+ cp = of_prop_next_string(prop, cp))
+ pr_cont(" %s", cp);
+ pr_cont(" (parent 0x%x)\n",
+ order.parent_by_phandle[order.order[i]->phandle]);
+ }
+}
+
+int __init of_init_build_order(struct device_node *root)
+{
+ struct device_node *child;
+ int rc = 0;
+
+ tree = root;
+ if (unlikely(!root))
+ return -EINVAL;
+
+ calc_max_phandle(root);
+ order.old_max_phandle = order.max_phandle;
+
+ for_each_child_of_node(root, child) {
+ rc = add_deps_lnx(root, child);
+ if (unlikely(rc))
+ break;
+ }
+
+ of_node_put(root);
+ topological_sort();
+
+ if (graph.finished)
+ return -EINVAL; /* cycle found */
+
+ return rc;
+}
diff --git a/scripts/dtc/dtc.c b/scripts/dtc/dtc.c
index fbe49d9..ac9858c 100644
--- a/scripts/dtc/dtc.c
+++ b/scripts/dtc/dtc.c
@@ -88,6 +88,8 @@ static void __attribute__ ((noreturn)) usage(void)
fprintf(stderr, "\t\tSort nodes and properties before outputting (only useful for\n\t\tcomparing trees)\n");
fprintf(stderr, "\t-D\n");
fprintf(stderr, "\t\tDo not automatically add dependencies for phandle references\n");
+ fprintf(stderr, "\t-t\n");
+ fprintf(stderr, "\t\tPrint (default) initialization order\n");
fprintf(stderr, "\t-v\n");
fprintf(stderr, "\t\tPrint DTC version and exit\n");
fprintf(stderr, "\t-H <phandle format>\n");
@@ -110,6 +112,7 @@ int main(int argc, char *argv[])
const char *depname = NULL;
int force = 0, sort = 0;
int dependencies = 1;
+ int init_order = 0;
const char *arg;
int opt;
FILE *outf = NULL;
@@ -121,7 +124,7 @@ int main(int argc, char *argv[])
minsize = 0;
padsize = 0;

- while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fqb:i:vH:sDW:E:"))
+ while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fqb:i:vH:sDtW:E:"))
!= EOF) {
switch (opt) {
case 'I':
@@ -183,6 +186,10 @@ int main(int argc, char *argv[])
dependencies = false;
break;

+ case 't':
+ init_order = true;
+ break;
+
case 'W':
parse_checks_option(true, false, optarg);
break;
@@ -245,6 +252,13 @@ int main(int argc, char *argv[])
if (dependencies)
add_dependencies(bi);

+ if (init_order) {
+ if (of_init_build_order(bi->dt))
+ exit(2);
+ of_init_print_order(arg);
+ exit(0);
+ }
+
if (streq(outname, "-")) {
outf = stdout;
} else {
@@ -266,5 +280,13 @@ int main(int argc, char *argv[])
die("Unknown output format \"%s\"\n", outform);
}

+ /*
+ * Check for cycles by building the initialzation order.
+ * This is done after the output was saved because it
+ * changes the tree slightly.
+ */
+ if (of_init_build_order(bi->dt))
+ exit(2);
+
exit(0);
}
diff --git a/scripts/dtc/dtc.h b/scripts/dtc/dtc.h
index c3dbeac..b89e08a 100644
--- a/scripts/dtc/dtc.h
+++ b/scripts/dtc/dtc.h
@@ -269,5 +269,7 @@ struct boot_info *dt_from_fs(const char *dirname);

/* Dependencies */
void add_dependencies(struct boot_info *bi);
+void of_init_print_order(const char *name);
+int of_init_build_order(struct node *root);

#endif /* _DTC_H */
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-12 22:58:07 UTC
Permalink
So, this should be rebased on the actual DTC repository,
and not the kernel's copy of the DTC under scripts.
I have patches for the standalone dtc too. I just didn't want to poste
almost the same patches twice and wanted to wait for feedback if people
do like this stuff at all.

Do you already want those 3 patches for the standalone dtc?

Regards,

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-12 16:47:55 UTC
Permalink
Add option -T do print a dependency graph in dot format for
generating a picture with Graphviz.

E.g.

dtc -T foo.dts | dot -T svg -o foo.svg

would generate the picture foo.png with the dependency graph.

Convential dependencies (those based on the tree structure) are having
black arrows, dependencies based on the property 'dependencies' are
having cyan arrows.

Option -D to not automatically add dependencies does still work, so
you could build a classic dependency graph with

dtc -D -T foo.dts | dot -T png -o foo_no_auto_deps.png

This works with binary blobs as input too. E.g.

CROSS_COMPILE=gcc-foo ARCH=arm make foo.dtb
scripts/dtc/dtc -I dtb -T arch/arm/boot/dts/foo.dtb

would print the dot file.

Signed-off-by: Alexander Holler <***@ahsoftware.de>
---
scripts/dtc/dependencies.c | 48 ++++++++++++++++++++++++++++++++++++++++------
scripts/dtc/dtc.c | 19 +++++++++++++++---
scripts/dtc/dtc.h | 2 +-
3 files changed, 59 insertions(+), 10 deletions(-)

diff --git a/scripts/dtc/dependencies.c b/scripts/dtc/dependencies.c
index 8fe1a8c..4579f6f 100644
--- a/scripts/dtc/dependencies.c
+++ b/scripts/dtc/dependencies.c
@@ -298,7 +298,7 @@ static void __init topological_sort(void)
depth_first_search(i);
}

-static int __init add_dep_list(struct device_node *node)
+static int __init add_dep_list(struct device_node *node, bool print_dot)
{
const __be32 *list, *list_end;
uint32_t ph;
@@ -328,6 +328,9 @@ static int __init add_dep_list(struct device_node *node)
rc = insert_edge(node->phandle, ph);
if (rc)
break;
+ if (print_dot)
+ printf(" node0x%x -> node0x%x [color=cyan]\n",
+ node->phandle, ph);
}

return rc;
@@ -352,9 +355,10 @@ static const char *of_prop_next_string(struct property *prop, const char *cur)
}

static int __init add_deps_lnx(struct device_node *parent,
- struct device_node *node)
+ struct device_node *node, bool print_dot)
{
struct device_node *child;
+ const char *cp;
int rc = 0;

if (!__of_device_is_available(node))
@@ -375,13 +379,33 @@ static int __init add_deps_lnx(struct device_node *parent,
return -EINVAL;
}
order.parent_by_phandle[node->phandle] = parent->phandle;
- rc = add_dep_list(node);
+ if (print_dot) {
+ struct property *prop;
+ printf(" node0x%x [label=\"0x%x %s", node->phandle,
+ node->phandle, node->name);
+ if (node->full_name)
+ printf(" (%s)", node->full_name);
+ prop = get_property(node, "compatible");
+ if (prop) {
+ printf("\\n");
+ for (cp = of_prop_next_string(prop, NULL); cp;
+ cp = of_prop_next_string(prop, cp)) {
+ if (cp != prop->val.val)
+ putchar(' ');
+ printf("%s", cp);
+ }
+ }
+ printf("\"];\n");
+ printf(" node0x%x -> node0x%x\n", node->phandle,
+ parent->phandle);
+ }
+ rc = add_dep_list(node, print_dot);
if (unlikely(rc))
return rc;
parent = node; /* change the parent only if node is a driver */
}
for_each_child_of_node(node, child) {
- rc = add_deps_lnx(parent, child);
+ rc = add_deps_lnx(parent, child, print_dot);
if (unlikely(rc))
break;
}
@@ -426,7 +450,7 @@ void __init of_init_print_order(const char *name)
}
}

-int __init of_init_build_order(struct device_node *root)
+int __init of_init_build_order(struct device_node *root, const char *print_dot)
{
struct device_node *child;
int rc = 0;
@@ -438,12 +462,24 @@ int __init of_init_build_order(struct device_node *root)
calc_max_phandle(root);
order.old_max_phandle = order.max_phandle;

+ if (print_dot) {
+ printf("digraph G {\n");
+ printf(" node0x%x [label=\"0x%x root (/)\"];\n",
+ order.max_phandle+1 , order.max_phandle+1);
+ }
+
for_each_child_of_node(root, child) {
- rc = add_deps_lnx(root, child);
+ rc = add_deps_lnx(root, child, print_dot);
if (unlikely(rc))
break;
}

+ if (print_dot) {
+ printf(" graph [label=\"Dependency Graph for %s (%u nodes, %u edges)\"];\n",
+ print_dot, graph.nvertices, graph.nedges);
+ printf("}\n");
+ }
+
of_node_put(root);
topological_sort();

diff --git a/scripts/dtc/dtc.c b/scripts/dtc/dtc.c
index ac9858c..ff09fc44 100644
--- a/scripts/dtc/dtc.c
+++ b/scripts/dtc/dtc.c
@@ -90,6 +90,8 @@ static void __attribute__ ((noreturn)) usage(void)
fprintf(stderr, "\t\tDo not automatically add dependencies for phandle references\n");
fprintf(stderr, "\t-t\n");
fprintf(stderr, "\t\tPrint (default) initialization order\n");
+ fprintf(stderr, "\t-T\n");
+ fprintf(stderr, "\t\tPrint dot with dependency graph (for use with Graphviz)\n");
fprintf(stderr, "\t-v\n");
fprintf(stderr, "\t\tPrint DTC version and exit\n");
fprintf(stderr, "\t-H <phandle format>\n");
@@ -113,6 +115,7 @@ int main(int argc, char *argv[])
int force = 0, sort = 0;
int dependencies = 1;
int init_order = 0;
+ int print_dot = 0;
const char *arg;
int opt;
FILE *outf = NULL;
@@ -124,7 +127,7 @@ int main(int argc, char *argv[])
minsize = 0;
padsize = 0;

- while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fqb:i:vH:sDtW:E:"))
+ while ((opt = getopt(argc, argv, "hI:O:o:V:d:R:S:p:fqb:i:vH:sDtTW:E:"))
!= EOF) {
switch (opt) {
case 'I':
@@ -190,6 +193,10 @@ int main(int argc, char *argv[])
init_order = true;
break;

+ case 'T':
+ print_dot = true;
+ break;
+
case 'W':
parse_checks_option(true, false, optarg);
break;
@@ -253,12 +260,18 @@ int main(int argc, char *argv[])
add_dependencies(bi);

if (init_order) {
- if (of_init_build_order(bi->dt))
+ if (of_init_build_order(bi->dt, 0))
exit(2);
of_init_print_order(arg);
exit(0);
}

+ if (print_dot) {
+ if (of_init_build_order(bi->dt, arg))
+ exit(2);
+ exit(0);
+ }
+
if (streq(outname, "-")) {
outf = stdout;
} else {
@@ -285,7 +298,7 @@ int main(int argc, char *argv[])
* This is done after the output was saved because it
* changes the tree slightly.
*/
- if (of_init_build_order(bi->dt))
+ if (of_init_build_order(bi->dt, 0))
exit(2);

exit(0);
diff --git a/scripts/dtc/dtc.h b/scripts/dtc/dtc.h
index b89e08a..b65afc2 100644
--- a/scripts/dtc/dtc.h
+++ b/scripts/dtc/dtc.h
@@ -270,6 +270,6 @@ struct boot_info *dt_from_fs(const char *dirname);
/* Dependencies */
void add_dependencies(struct boot_info *bi);
void of_init_print_order(const char *name);
-int of_init_build_order(struct node *root);
+int of_init_build_order(struct node *root, const char *print_dot);

#endif /* _DTC_H */
--
1.8.3.1
Alexander Holler
2014-05-12 16:47:56 UTC
Permalink
The init system currently calls unknown functions with almost unknown
functionality in an almost random order.

Fixing this is on a short-term basis is a bit tricky.

In order to register drivers with a deterministic order, a list of all
available in-kernel drivers is needed. Unfortunately such a list doesn't
exist, just a list of initcalls does exist.

The trick now is to first call special annotated initcalls (I call those
"well done") for platform drivers, but not actualy starting those drivers
by calling their probe function, but just collectiong their meta datas
(struct platform_driver). After all those informations were collected,
available the drivers will be started according to the previously
determined order.

The annotation of such platform drivers is necessary because it must be
made sure that those drivers don't care if the probe is actually called in
their initcall or later.

That means that all platform drivers which already do use

module_platform_driver() or
module_platform_driver_probe()

don't need any modification because their initcall is known and already well
done. But all platform drivers which do use

module_init() or
*_initcall()

have to be reviewed if they are "well done". If they are, they need a change
like

-module_init(foo_init);
+well_done_platform_module_init(foo_init);

or

-subsys_initcall(foo_init);
+well_done_platform_initcall(subsys, foo_init);

to become included in the deterministic order in which platform drivers
will be initialized.

All other platform drivers will still be initialized in random order before
platform drivers included in the deterministic order will be initialized.
"Well done" drivers which don't appear in the order (because they don't appear
in the DT) will be initialized after those which do appear in the order.

If CONFIG_OF_DEPENDENCIES is disabled, nothing is changed at all.

The long range target to fix the problem should be to include a list (array)
of struct platform_driver in the kernel for all in-kernel platform drivers,
instead of just initcalls. This will be easy if all platform drivers have
become "well done".

Unfortunately there are some drivers which will need quiet some changes
to become "well done". As an example for such an initcall look e.g. at
drivers/tty/serial/8250/8250_core.c.

Signed-off-by: Alexander Holler <holler-SXC+***@public.gmane.org>
---
drivers/base/platform.c | 13 +++++++
drivers/of/of_dependencies.c | 79 +++++++++++++++++++++++++++++++++++++++
include/asm-generic/vmlinux.lds.h | 1 +
include/linux/init.h | 19 ++++++++++
include/linux/of_dependencies.h | 5 +++
include/linux/platform_device.h | 16 ++++++--
init/main.c | 17 ++++++++-
7 files changed, 145 insertions(+), 5 deletions(-)

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index bc78848..b9c9b33 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -13,6 +13,7 @@
#include <linux/string.h>
#include <linux/platform_device.h>
#include <linux/of_device.h>
+#include <linux/of_dependencies.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/dma-mapping.h>
@@ -541,6 +542,12 @@ int __platform_driver_register(struct platform_driver *drv,
if (drv->shutdown)
drv->driver.shutdown = platform_drv_shutdown;

+#ifdef CONFIG_OF_DEPENDENCIES
+ if (of_init_is_recording())
+ /* Just record the driver */
+ return of_init_register_platform_driver(drv);
+ else
+#endif
return driver_register(&drv->driver);
}
EXPORT_SYMBOL_GPL(__platform_driver_register);
@@ -590,8 +597,14 @@ int __init_or_module platform_driver_probe(struct platform_driver *drv,

/* temporary section violation during probe() */
drv->probe = probe;
+
retval = code = platform_driver_register(drv);

+#ifdef CONFIG_OF_DEPENDENCIES
+ if (of_init_is_recording())
+ /* Just record the driver */
+ return retval;
+#endif
/*
* Fixup that section violation, being paranoid about code scanning
* the list of drivers in order to probe new devices. Check to see
diff --git a/drivers/of/of_dependencies.c b/drivers/of/of_dependencies.c
index 7905172..4af62d5 100644
--- a/drivers/of/of_dependencies.c
+++ b/drivers/of/of_dependencies.c
@@ -46,9 +46,12 @@ struct init_order {
/* Used to keep track of parent devices in regard to the DT */
uint32_t parent_by_phandle[MAX_DT_NODES+1];
struct device *device_by_phandle[MAX_DT_NODES+1];
+ struct platform_driver *platform_drivers[MAX_DT_NODES+1];
+ unsigned count_drivers;
};
static struct init_order order __initdata;

+static bool is_recording;

/* Copied from drivers/of/base.c (because it's lockless). */
static struct property * __init __of_find_property(const struct device_node *np,
@@ -401,3 +404,79 @@ void __init of_init_free_order(void)
order.count = 0;
/* remove_new_phandles(); */
}
+
+void __init of_init_set_recording(bool recording)
+{
+ is_recording = recording;
+}
+
+bool of_init_is_recording(void)
+{
+ return is_recording;
+}
+
+int of_init_register_platform_driver(struct platform_driver *drv)
+{
+ BUG_ON(!is_recording);
+ order.platform_drivers[order.count_drivers++] = drv;
+ return 0;
+}
+
+void __init of_init_register_drivers(void)
+{
+ unsigned i, j;
+ int rc __maybe_unused;
+
+ BUG_ON(is_recording);
+ /*
+ * Because we already have a list of devices and drivers together
+ * with their compatible strings, the below code could be speed up
+ * by replacing the functions which are walking through lists with
+ * something which uses trees or hashes to compare/search strings.
+ * These are of_driver_match_device() and driver_find() (the latter
+ * is called again in driver_register().
+ */
+ for (i = 0; i < order.count; ++i) {
+ struct device_node *node = order.order[i];
+ struct device *dev = order.device_by_phandle[node->phandle];
+
+ for (j = 0; j < order.count_drivers; ++j) {
+ struct platform_driver *drv = order.platform_drivers[j];
+
+ if (of_driver_match_device(dev, &drv->driver)) {
+ if (!driver_find(drv->driver.name,
+ drv->driver.bus))
+ platform_driver_register(drv);
+ if (dev->parent)
+ device_lock(dev->parent);
+ rc = device_attach(dev);
+ if (dev->parent)
+ device_unlock(dev->parent);
+ break;
+ }
+ }
+ if (j >= order.count_drivers) {
+ /*
+ * No driver in the initialization order matched,
+ * try to attach the device, maybe a driver already
+ * exists (e.g. loaded pre-smp).
+ */
+ if (dev->parent)
+ device_lock(dev->parent);
+ rc = device_attach(dev);
+ if (dev->parent)
+ device_unlock(dev->parent);
+ }
+ }
+ /*
+ * Now just register all drivers, including those not used through
+ * the initialization order (well-done drivers which aren't listed
+ * in the DT or blacklisted through of_init_create_devices()).
+ */
+ for (j = 0; j < order.count_drivers; ++j) {
+ struct platform_driver *drv = order.platform_drivers[j];
+
+ if (!driver_find(drv->driver.name, drv->driver.bus))
+ platform_driver_register(drv);
+ }
+}
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bc2121f..cedb3b0 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -633,6 +633,7 @@
INIT_CALLS_LEVEL(rootfs) \
INIT_CALLS_LEVEL(6) \
INIT_CALLS_LEVEL(7) \
+ INIT_CALLS_LEVEL(8) \
VMLINUX_SYMBOL(__initcall_end) = .;

#define CON_INITCALL \
diff --git a/include/linux/init.h b/include/linux/init.h
index e168880..acb7dfa 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -209,6 +209,23 @@ extern bool initcall_debug;
#define late_initcall(fn) __define_initcall(fn, 7)
#define late_initcall_sync(fn) __define_initcall(fn, 7s)

+/*
+ * A well_done_platform_module_init or well_done_platform_initcall
+ * only calls platform_driver_register() or platform_driver_probe()
+ * and ignores the return code. This is necessary because the
+ * actual calls to platform_driver_register() or platform_driver_probe()
+ * will be delayed when CONFIG_OF_DEPENDENCIES is enabled. This is done
+ * to sort those calls based on the dependencies in the DT (matched to the
+ * platform driver data).
+ */
+#ifdef CONFIG_OF_DEPENDENCIES
+#define well_done_platform_module_init(fn) __define_initcall(fn, 8)
+#define well_done_platform_initcall(leve, fn) __define_initcall(fn, 8)
+#else
+#define well_done_platform_module_init(fn) module_init(fn)
+#define well_done_platform_initcall(level, fn) level ## _initcall(fn)
+#endif
+
#define __initcall(fn) device_initcall(fn)

#define __exitcall(fn) \
@@ -289,6 +306,8 @@ void __init parse_early_options(char *cmdline);
#define rootfs_initcall(fn) module_init(fn)
#define device_initcall(fn) module_init(fn)
#define late_initcall(fn) module_init(fn)
+#define well_done_platform_initcall(fn) module_init(fn)
+#define well_done_platform_module_init(fn) module_init(fn)

#define console_initcall(fn) module_init(fn)
#define security_initcall(fn) module_init(fn)
diff --git a/include/linux/of_dependencies.h b/include/linux/of_dependencies.h
index e046ce2..8869162 100644
--- a/include/linux/of_dependencies.h
+++ b/include/linux/of_dependencies.h
@@ -58,4 +58,9 @@ extern void of_init_create_devices(const struct of_device_id *matches,
*/
extern void of_init_free_order(void);

+void of_init_set_recording(bool recording);
+bool of_init_is_recording(void);
+int of_init_register_platform_driver(struct platform_driver *drv);
+void of_init_register_drivers(void);
+
#endif /* _LINUX_OF_DEPENDENCIES_H */
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 16f6654..b8559d9 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -215,9 +215,17 @@ static inline void platform_set_drvdata(struct platform_device *pdev,
* boilerplate. Each module may only use this macro once, and
* calling it replaces module_init() and module_exit()
*/
-#define module_platform_driver(__platform_driver) \
- module_driver(__platform_driver, platform_driver_register, \
- platform_driver_unregister)
+#define module_platform_driver(__driver) \
+static int __init __driver##_init(void) \
+{ \
+ return platform_driver_register(&(__driver)); \
+} \
+well_done_platform_module_init(__driver##_init); \
+static void __exit __driver##_exit(void) \
+{ \
+ platform_driver_unregister(&(__driver)); \
+} \
+module_exit(__driver##_exit);

/* module_platform_driver_probe() - Helper macro for drivers that don't do
* anything special in module init/exit. This eliminates a lot of
@@ -230,7 +238,7 @@ static int __init __platform_driver##_init(void) \
return platform_driver_probe(&(__platform_driver), \
__platform_probe); \
} \
-module_init(__platform_driver##_init); \
+well_done_platform_module_init(__platform_driver##_init); \
static void __exit __platform_driver##_exit(void) \
{ \
platform_driver_unregister(&(__platform_driver)); \
diff --git a/init/main.c b/init/main.c
index 9c7fd4c..7591cd1 100644
--- a/init/main.c
+++ b/init/main.c
@@ -77,6 +77,7 @@
#include <linux/sched_clock.h>
#include <linux/context_tracking.h>
#include <linux/random.h>
+#include <linux/of_dependencies.h>

#include <asm/io.h>
#include <asm/bugs.h>
@@ -720,6 +721,7 @@ extern initcall_t __initcall4_start[];
extern initcall_t __initcall5_start[];
extern initcall_t __initcall6_start[];
extern initcall_t __initcall7_start[];
+extern initcall_t __initcall8_start[];
extern initcall_t __initcall_end[];

static initcall_t *initcall_levels[] __initdata = {
@@ -731,6 +733,7 @@ static initcall_t *initcall_levels[] __initdata = {
__initcall5_start,
__initcall6_start,
__initcall7_start,
+ __initcall8_start,
__initcall_end,
};

@@ -744,6 +747,8 @@ static char *initcall_level_names[] __initdata = {
"fs",
"device",
"late",
+ /* must be the last level to become excluded in do_initcalls() */
+ "well-done-platform-driver",
};

static void __init do_initcall_level(int level)
@@ -766,7 +771,7 @@ static void __init do_initcalls(void)
{
int level;

- for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++)
+ for (level = 0; level < ARRAY_SIZE(initcall_levels) - 3; level++)
do_initcall_level(level);
}

@@ -787,6 +792,16 @@ static void __init do_basic_setup(void)
do_ctors();
usermodehelper_enable();
do_initcalls();
+#ifdef CONFIG_OF_DEPENDENCIES
+ /* collect a list of available platform drivers */
+ of_init_set_recording(true);
+ do_initcall_level(ARRAY_SIZE(initcall_levels) - 2);
+ of_init_set_recording(false);
+ /* probe available platform drivers with deterministic order */
+ of_init_register_drivers();
+ /* register late drivers */
+ do_initcall_level(ARRAY_SIZE(initcall_levels) - 3);
+#endif
random_int_secret_init();
}
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Grant Likely
2014-05-14 14:13:00 UTC
Permalink
Post by Alexander Holler
The init system currently calls unknown functions with almost unknown
functionality in an almost random order.
Correct, we've got a module system. Some would say that is a strength!
:-) That said, I don't object to optimizing to the optimal order when
possible.
Post by Alexander Holler
Fixing this is on a short-term basis is a bit tricky.
In order to register drivers with a deterministic order, a list of all
available in-kernel drivers is needed. Unfortunately such a list doesn't
exist, just a list of initcalls does exist.
The trick now is to first call special annotated initcalls (I call those
"well done") for platform drivers, but not actualy starting those drivers
by calling their probe function, but just collectiong their meta datas
(struct platform_driver). After all those informations were collected,
available the drivers will be started according to the previously
determined order.
Why does the initcall level matter? Why not just let the initcalls
happen, capture the calls that register a driver, and then use that list
later?
Post by Alexander Holler
The annotation of such platform drivers is necessary because it must be
made sure that those drivers don't care if the probe is actually called in
their initcall or later.
That means that all platform drivers which already do use
module_platform_driver() or
module_platform_driver_probe()
don't need any modification because their initcall is known and already well
done. But all platform drivers which do use
module_init() or
*_initcall()
have to be reviewed if they are "well done". If they are, they need a change
like
-module_init(foo_init);
+well_done_platform_module_init(foo_init);
or
-subsys_initcall(foo_init);
+well_done_platform_initcall(subsys, foo_init);
to become included in the deterministic order in which platform drivers
will be initialized.
All other platform drivers will still be initialized in random order before
platform drivers included in the deterministic order will be initialized.
"Well done" drivers which don't appear in the order (because they don't appear
in the DT) will be initialized after those which do appear in the order.
If CONFIG_OF_DEPENDENCIES is disabled, nothing is changed at all.
The long range target to fix the problem should be to include a list (array)
of struct platform_driver in the kernel for all in-kernel platform drivers,
instead of just initcalls. This will be easy if all platform drivers have
become "well done".
How will that list be constructed? How will it account for multiple
platforms, each requiring a different init order?
Post by Alexander Holler
Unfortunately there are some drivers which will need quiet some changes
to become "well done". As an example for such an initcall look e.g. at
drivers/tty/serial/8250/8250_core.c.
---
drivers/base/platform.c | 13 +++++++
drivers/of/of_dependencies.c | 79 +++++++++++++++++++++++++++++++++++++++
include/asm-generic/vmlinux.lds.h | 1 +
include/linux/init.h | 19 ++++++++++
include/linux/of_dependencies.h | 5 +++
include/linux/platform_device.h | 16 ++++++--
init/main.c | 17 ++++++++-
7 files changed, 145 insertions(+), 5 deletions(-)
diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index bc78848..b9c9b33 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -13,6 +13,7 @@
#include <linux/string.h>
#include <linux/platform_device.h>
#include <linux/of_device.h>
+#include <linux/of_dependencies.h>
#include <linux/module.h>
#include <linux/init.h>
#include <linux/dma-mapping.h>
@@ -541,6 +542,12 @@ int __platform_driver_register(struct platform_driver *drv,
if (drv->shutdown)
drv->driver.shutdown = platform_drv_shutdown;
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (of_init_is_recording())
+ /* Just record the driver */
+ return of_init_register_platform_driver(drv);
+ else
+#endif
return driver_register(&drv->driver);
}
EXPORT_SYMBOL_GPL(__platform_driver_register);
@@ -590,8 +597,14 @@ int __init_or_module platform_driver_probe(struct platform_driver *drv,
/* temporary section violation during probe() */
drv->probe = probe;
+
retval = code = platform_driver_register(drv);
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (of_init_is_recording())
+ /* Just record the driver */
+ return retval;
+#endif
/*
* Fixup that section violation, being paranoid about code scanning
* the list of drivers in order to probe new devices. Check to see
diff --git a/drivers/of/of_dependencies.c b/drivers/of/of_dependencies.c
index 7905172..4af62d5 100644
--- a/drivers/of/of_dependencies.c
+++ b/drivers/of/of_dependencies.c
@@ -46,9 +46,12 @@ struct init_order {
/* Used to keep track of parent devices in regard to the DT */
uint32_t parent_by_phandle[MAX_DT_NODES+1];
struct device *device_by_phandle[MAX_DT_NODES+1];
+ struct platform_driver *platform_drivers[MAX_DT_NODES+1];
+ unsigned count_drivers;
};
static struct init_order order __initdata;
+static bool is_recording;
/* Copied from drivers/of/base.c (because it's lockless). */
static struct property * __init __of_find_property(const struct device_node *np,
@@ -401,3 +404,79 @@ void __init of_init_free_order(void)
order.count = 0;
/* remove_new_phandles(); */
}
+
+void __init of_init_set_recording(bool recording)
+{
+ is_recording = recording;
+}
+
+bool of_init_is_recording(void)
+{
+ return is_recording;
+}
+
+int of_init_register_platform_driver(struct platform_driver *drv)
+{
+ BUG_ON(!is_recording);
+ order.platform_drivers[order.count_drivers++] = drv;
+ return 0;
+}
+
+void __init of_init_register_drivers(void)
+{
+ unsigned i, j;
+ int rc __maybe_unused;
+
+ BUG_ON(is_recording);
+ /*
+ * Because we already have a list of devices and drivers together
+ * with their compatible strings, the below code could be speed up
+ * by replacing the functions which are walking through lists with
+ * something which uses trees or hashes to compare/search strings.
+ * These are of_driver_match_device() and driver_find() (the latter
+ * is called again in driver_register().
+ */
+ for (i = 0; i < order.count; ++i) {
+ struct device_node *node = order.order[i];
+ struct device *dev = order.device_by_phandle[node->phandle];
+
+ for (j = 0; j < order.count_drivers; ++j) {
+ struct platform_driver *drv = order.platform_drivers[j];
+
+ if (of_driver_match_device(dev, &drv->driver)) {
+ if (!driver_find(drv->driver.name,
+ drv->driver.bus))
+ platform_driver_register(drv);
+ if (dev->parent)
+ device_lock(dev->parent);
+ rc = device_attach(dev);
+ if (dev->parent)
+ device_unlock(dev->parent);
+ break;
+ }
+ }
+ if (j >= order.count_drivers) {
+ /*
+ * No driver in the initialization order matched,
+ * try to attach the device, maybe a driver already
+ * exists (e.g. loaded pre-smp).
+ */
+ if (dev->parent)
+ device_lock(dev->parent);
+ rc = device_attach(dev);
+ if (dev->parent)
+ device_unlock(dev->parent);
+ }
+ }
+ /*
+ * Now just register all drivers, including those not used through
+ * the initialization order (well-done drivers which aren't listed
+ * in the DT or blacklisted through of_init_create_devices()).
+ */
+ for (j = 0; j < order.count_drivers; ++j) {
+ struct platform_driver *drv = order.platform_drivers[j];
+
+ if (!driver_find(drv->driver.name, drv->driver.bus))
+ platform_driver_register(drv);
+ }
+}
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index bc2121f..cedb3b0 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -633,6 +633,7 @@
INIT_CALLS_LEVEL(rootfs) \
INIT_CALLS_LEVEL(6) \
INIT_CALLS_LEVEL(7) \
+ INIT_CALLS_LEVEL(8) \
VMLINUX_SYMBOL(__initcall_end) = .;
#define CON_INITCALL \
diff --git a/include/linux/init.h b/include/linux/init.h
index e168880..acb7dfa 100644
--- a/include/linux/init.h
+++ b/include/linux/init.h
@@ -209,6 +209,23 @@ extern bool initcall_debug;
#define late_initcall(fn) __define_initcall(fn, 7)
#define late_initcall_sync(fn) __define_initcall(fn, 7s)
+/*
+ * A well_done_platform_module_init or well_done_platform_initcall
+ * only calls platform_driver_register() or platform_driver_probe()
+ * and ignores the return code. This is necessary because the
+ * actual calls to platform_driver_register() or platform_driver_probe()
+ * will be delayed when CONFIG_OF_DEPENDENCIES is enabled. This is done
+ * to sort those calls based on the dependencies in the DT (matched to the
+ * platform driver data).
+ */
+#ifdef CONFIG_OF_DEPENDENCIES
+#define well_done_platform_module_init(fn) __define_initcall(fn, 8)
+#define well_done_platform_initcall(leve, fn) __define_initcall(fn, 8)
+#else
+#define well_done_platform_module_init(fn) module_init(fn)
+#define well_done_platform_initcall(level, fn) level ## _initcall(fn)
+#endif
+
#define __initcall(fn) device_initcall(fn)
#define __exitcall(fn) \
@@ -289,6 +306,8 @@ void __init parse_early_options(char *cmdline);
#define rootfs_initcall(fn) module_init(fn)
#define device_initcall(fn) module_init(fn)
#define late_initcall(fn) module_init(fn)
+#define well_done_platform_initcall(fn) module_init(fn)
+#define well_done_platform_module_init(fn) module_init(fn)
#define console_initcall(fn) module_init(fn)
#define security_initcall(fn) module_init(fn)
diff --git a/include/linux/of_dependencies.h b/include/linux/of_dependencies.h
index e046ce2..8869162 100644
--- a/include/linux/of_dependencies.h
+++ b/include/linux/of_dependencies.h
@@ -58,4 +58,9 @@ extern void of_init_create_devices(const struct of_device_id *matches,
*/
extern void of_init_free_order(void);
+void of_init_set_recording(bool recording);
+bool of_init_is_recording(void);
+int of_init_register_platform_driver(struct platform_driver *drv);
+void of_init_register_drivers(void);
+
#endif /* _LINUX_OF_DEPENDENCIES_H */
diff --git a/include/linux/platform_device.h b/include/linux/platform_device.h
index 16f6654..b8559d9 100644
--- a/include/linux/platform_device.h
+++ b/include/linux/platform_device.h
@@ -215,9 +215,17 @@ static inline void platform_set_drvdata(struct platform_device *pdev,
* boilerplate. Each module may only use this macro once, and
* calling it replaces module_init() and module_exit()
*/
-#define module_platform_driver(__platform_driver) \
- module_driver(__platform_driver, platform_driver_register, \
- platform_driver_unregister)
+#define module_platform_driver(__driver) \
+static int __init __driver##_init(void) \
+{ \
+ return platform_driver_register(&(__driver)); \
+} \
+well_done_platform_module_init(__driver##_init); \
+static void __exit __driver##_exit(void) \
+{ \
+ platform_driver_unregister(&(__driver)); \
+} \
+module_exit(__driver##_exit);
/* module_platform_driver_probe() - Helper macro for drivers that don't do
* anything special in module init/exit. This eliminates a lot of
@@ -230,7 +238,7 @@ static int __init __platform_driver##_init(void) \
return platform_driver_probe(&(__platform_driver), \
__platform_probe); \
} \
-module_init(__platform_driver##_init); \
+well_done_platform_module_init(__platform_driver##_init); \
static void __exit __platform_driver##_exit(void) \
{ \
platform_driver_unregister(&(__platform_driver)); \
diff --git a/init/main.c b/init/main.c
index 9c7fd4c..7591cd1 100644
--- a/init/main.c
+++ b/init/main.c
@@ -77,6 +77,7 @@
#include <linux/sched_clock.h>
#include <linux/context_tracking.h>
#include <linux/random.h>
+#include <linux/of_dependencies.h>
#include <asm/io.h>
#include <asm/bugs.h>
@@ -720,6 +721,7 @@ extern initcall_t __initcall4_start[];
extern initcall_t __initcall5_start[];
extern initcall_t __initcall6_start[];
extern initcall_t __initcall7_start[];
+extern initcall_t __initcall8_start[];
extern initcall_t __initcall_end[];
static initcall_t *initcall_levels[] __initdata = {
@@ -731,6 +733,7 @@ static initcall_t *initcall_levels[] __initdata = {
__initcall5_start,
__initcall6_start,
__initcall7_start,
+ __initcall8_start,
__initcall_end,
};
@@ -744,6 +747,8 @@ static char *initcall_level_names[] __initdata = {
"fs",
"device",
"late",
+ /* must be the last level to become excluded in do_initcalls() */
+ "well-done-platform-driver",
};
static void __init do_initcall_level(int level)
@@ -766,7 +771,7 @@ static void __init do_initcalls(void)
{
int level;
- for (level = 0; level < ARRAY_SIZE(initcall_levels) - 1; level++)
+ for (level = 0; level < ARRAY_SIZE(initcall_levels) - 3; level++)
do_initcall_level(level);
}
@@ -787,6 +792,16 @@ static void __init do_basic_setup(void)
do_ctors();
usermodehelper_enable();
do_initcalls();
+#ifdef CONFIG_OF_DEPENDENCIES
+ /* collect a list of available platform drivers */
+ of_init_set_recording(true);
+ do_initcall_level(ARRAY_SIZE(initcall_levels) - 2);
+ of_init_set_recording(false);
+ /* probe available platform drivers with deterministic order */
+ of_init_register_drivers();
+ /* register late drivers */
+ do_initcall_level(ARRAY_SIZE(initcall_levels) - 3);
+#endif
random_int_secret_init();
}
--
1.8.3.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Alexander Holler
2014-05-14 14:58:30 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
The init system currently calls unknown functions with almost unknown
functionality in an almost random order.
Correct, we've got a module system. Some would say that is a strength!
:-) That said, I don't object to optimizing to the optimal order when
possible.
Modules do work after init happened, that isn't what this feature is about.
Post by Grant Likely
Post by Alexander Holler
Fixing this is on a short-term basis is a bit tricky.
In order to register drivers with a deterministic order, a list of all
available in-kernel drivers is needed. Unfortunately such a list doesn't
exist, just a list of initcalls does exist.
The trick now is to first call special annotated initcalls (I call those
"well done") for platform drivers, but not actualy starting those drivers
by calling their probe function, but just collectiong their meta datas
(struct platform_driver). After all those informations were collected,
available the drivers will be started according to the previously
determined order.
Why does the initcall level matter? Why not just let the initcalls
happen, capture the calls that register a driver, and then use that list
later?
Some initcalls assume that stuff is present when they called probe or
register and do further action based on the rc code.
Post by Grant Likely
Post by Alexander Holler
The long range target to fix the problem should be to include a list (array)
of struct platform_driver in the kernel for all in-kernel platform drivers,
instead of just initcalls. This will be easy if all platform drivers have
become "well done".
How will that list be constructed? How will it account for multiple
platforms, each requiring a different init order?
The list could be build just like the list of initcalls, but containing
structs platform instead of function pointers.

The order is in now way part of this list, after all that's what this
feature is about. The order is determined by metadatas in the DT, to get
rid of a lot of otherwise necessary hardcoded stuff to fix the order in
drivers.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Grant Likely
2014-05-14 19:32:50 UTC
Permalink
Post by Alexander Holler
On Mon, 12 May 2014 18:47:56 +0200, Alexander Holler
Post by Alexander Holler
The init system currently calls unknown functions with almost unknown
functionality in an almost random order.
Correct, we've got a module system. Some would say that is a strength!
:-) That said, I don't object to optimizing to the optimal order when
possible.
Modules do work after init happened, that isn't what this feature is about.
Oops, I meant modular. I wasn't talking about modules either. The
driver model is designed to match devices with drivers regardless of
the order that either of them get registered to the system. I think
that is a strong aspect of the drivercore. What it doesn't have is any
way of optimizing the probe order, which is at the heart of your
proposal.
Post by Alexander Holler
Post by Alexander Holler
Fixing this is on a short-term basis is a bit tricky.
In order to register drivers with a deterministic order, a list of all
available in-kernel drivers is needed. Unfortunately such a list doesn't
exist, just a list of initcalls does exist.
The trick now is to first call special annotated initcalls (I call those
"well done") for platform drivers, but not actualy starting those drivers
by calling their probe function, but just collectiong their meta datas
(struct platform_driver). After all those informations were collected,
available the drivers will be started according to the previously
determined order.
Why does the initcall level matter? Why not just let the initcalls
happen, capture the calls that register a driver, and then use that list
later?
Some initcalls assume that stuff is present when they called probe or
register and do further action based on the rc code.
What I mean is that manipulating the initcall level isn't the best way
to handle it. We've got enough initcalls and there isn't a need to add
more. Other ways to handle the problem would be to either have a
variant of the platform_driver_register() that triggers your desired
behavour, or add a flag to the struct device_driver that tells the
driver core that it should try to resolve ordering. In both cases the
module_platform_driver() macro can do the magic bit. Other drivers
will have to do it manually.

g.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-12 16:47:57 UTC
Permalink
This patch contains the necessary changes for some drivers which are used
by the boards I have. The list isn't complete (therefor the WIP) and is
meant as an example. If considered to be mainlined, I assume these changes
should end up in one patch for every changed driver.

Signed-off-by: Alexander Holler <holler-SXC+***@public.gmane.org>
---
drivers/dma/mv_xor.c | 2 +-
drivers/dma/omap-dma.c | 2 +-
drivers/gpio/gpio-mvebu.c | 2 +-
drivers/gpio/gpio-twl4030.c | 2 +-
drivers/i2c/busses/i2c-omap.c | 2 +-
drivers/iommu/omap-iommu.c | 2 +-
drivers/mailbox/mailbox-omap2.c | 2 +-
drivers/net/ethernet/marvell/mv643xx_eth.c | 2 +-
drivers/regulator/fixed.c | 2 +-
drivers/regulator/twl-regulator.c | 2 +-
drivers/usb/host/ehci-omap.c | 2 +-
drivers/usb/host/ehci-orion.c | 2 +-
drivers/usb/host/ohci-omap3.c | 2 +-
drivers/usb/phy/phy-generic.c | 2 +-
14 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/drivers/dma/mv_xor.c b/drivers/dma/mv_xor.c
index 766b68e..7f1091a 100644
--- a/drivers/dma/mv_xor.c
+++ b/drivers/dma/mv_xor.c
@@ -1316,7 +1316,7 @@ static int __init mv_xor_init(void)
{
return platform_driver_register(&mv_xor_driver);
}
-module_init(mv_xor_init);
+well_done_platform_module_init(mv_xor_init);

/* it's currently unsafe to unload this module */
#if 0
diff --git a/drivers/dma/omap-dma.c b/drivers/dma/omap-dma.c
index 362e7c4..a523025 100644
--- a/drivers/dma/omap-dma.c
+++ b/drivers/dma/omap-dma.c
@@ -707,7 +707,7 @@ static int omap_dma_init(void)
{
return platform_driver_register(&omap_dma_driver);
}
-subsys_initcall(omap_dma_init);
+well_done_platform_initcall(subsys, omap_dma_init);

static void __exit omap_dma_exit(void)
{
diff --git a/drivers/gpio/gpio-mvebu.c b/drivers/gpio/gpio-mvebu.c
index 3b1fd1c..c151f6e 100644
--- a/drivers/gpio/gpio-mvebu.c
+++ b/drivers/gpio/gpio-mvebu.c
@@ -736,4 +736,4 @@ static int __init mvebu_gpio_init(void)
{
return platform_driver_register(&mvebu_gpio_driver);
}
-postcore_initcall(mvebu_gpio_init);
+well_done_platform_initcall(postcore, mvebu_gpio_init);
diff --git a/drivers/gpio/gpio-twl4030.c b/drivers/gpio/gpio-twl4030.c
index 8b88ca2..6c18f4a 100644
--- a/drivers/gpio/gpio-twl4030.c
+++ b/drivers/gpio/gpio-twl4030.c
@@ -618,7 +618,7 @@ static int __init gpio_twl4030_init(void)
{
return platform_driver_register(&gpio_twl4030_driver);
}
-subsys_initcall(gpio_twl4030_init);
+well_done_platform_initcall(subsys, gpio_twl4030_init);

static void __exit gpio_twl4030_exit(void)
{
diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 90dcc2e..4df05c0 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -1352,7 +1352,7 @@ omap_i2c_init_driver(void)
{
return platform_driver_register(&omap_i2c_driver);
}
-subsys_initcall(omap_i2c_init_driver);
+well_done_platform_initcall(subsys, omap_i2c_init_driver);

static void __exit omap_i2c_exit_driver(void)
{
diff --git a/drivers/iommu/omap-iommu.c b/drivers/iommu/omap-iommu.c
index bcd78a7..c121708 100644
--- a/drivers/iommu/omap-iommu.c
+++ b/drivers/iommu/omap-iommu.c
@@ -1282,7 +1282,7 @@ static int __init omap_iommu_init(void)
return platform_driver_register(&omap_iommu_driver);
}
/* must be ready before omap3isp is probed */
-subsys_initcall(omap_iommu_init);
+well_done_platform_initcall(subsys, omap_iommu_init);

static void __exit omap_iommu_exit(void)
{
diff --git a/drivers/mailbox/mailbox-omap2.c b/drivers/mailbox/mailbox-omap2.c
index 42d2b89..919da67 100644
--- a/drivers/mailbox/mailbox-omap2.c
+++ b/drivers/mailbox/mailbox-omap2.c
@@ -347,7 +347,7 @@ static void __exit omap2_mbox_exit(void)
platform_driver_unregister(&omap2_mbox_driver);
}

-module_init(omap2_mbox_init);
+well_done_platform_module_init(omap2_mbox_init);
module_exit(omap2_mbox_exit);

MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c
index a2565ce..2fcd832 100644
--- a/drivers/net/ethernet/marvell/mv643xx_eth.c
+++ b/drivers/net/ethernet/marvell/mv643xx_eth.c
@@ -3011,7 +3011,7 @@ static int __init mv643xx_eth_init_module(void)

return rc;
}
-module_init(mv643xx_eth_init_module);
+well_done_platform_module_init(mv643xx_eth_init_module);

static void __exit mv643xx_eth_cleanup_module(void)
{
diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c
index 5ea64b9..3a71016 100644
--- a/drivers/regulator/fixed.c
+++ b/drivers/regulator/fixed.c
@@ -241,7 +241,7 @@ static int __init regulator_fixed_voltage_init(void)
{
return platform_driver_register(&regulator_fixed_voltage_driver);
}
-subsys_initcall(regulator_fixed_voltage_init);
+well_done_platform_initcall(subsys, regulator_fixed_voltage_init);

static void __exit regulator_fixed_voltage_exit(void)
{
diff --git a/drivers/regulator/twl-regulator.c b/drivers/regulator/twl-regulator.c
index fed28ab..646caf6 100644
--- a/drivers/regulator/twl-regulator.c
+++ b/drivers/regulator/twl-regulator.c
@@ -1240,7 +1240,7 @@ static int __init twlreg_init(void)
{
return platform_driver_register(&twlreg_driver);
}
-subsys_initcall(twlreg_init);
+well_done_platform_initcall(subsys, twlreg_init);

static void __exit twlreg_exit(void)
{
diff --git a/drivers/usb/host/ehci-omap.c b/drivers/usb/host/ehci-omap.c
index a24720b..19bca40 100644
--- a/drivers/usb/host/ehci-omap.c
+++ b/drivers/usb/host/ehci-omap.c
@@ -310,7 +310,7 @@ static int __init ehci_omap_init(void)
ehci_init_driver(&ehci_omap_hc_driver, &ehci_omap_overrides);
return platform_driver_register(&ehci_hcd_omap_driver);
}
-module_init(ehci_omap_init);
+well_done_platform_module_init(ehci_omap_init);

static void __exit ehci_omap_cleanup(void)
{
diff --git a/drivers/usb/host/ehci-orion.c b/drivers/usb/host/ehci-orion.c
index 30d35e5..6d59923 100644
--- a/drivers/usb/host/ehci-orion.c
+++ b/drivers/usb/host/ehci-orion.c
@@ -298,7 +298,7 @@ static int __init ehci_orion_init(void)
ehci_init_driver(&ehci_orion_hc_driver, NULL);
return platform_driver_register(&ehci_orion_driver);
}
-module_init(ehci_orion_init);
+well_done_platform_module_init(ehci_orion_init);

static void __exit ehci_orion_cleanup(void)
{
diff --git a/drivers/usb/host/ohci-omap3.c b/drivers/usb/host/ohci-omap3.c
index ec15aeb..26bb8d1 100644
--- a/drivers/usb/host/ohci-omap3.c
+++ b/drivers/usb/host/ohci-omap3.c
@@ -197,7 +197,7 @@ static int __init ohci_omap3_init(void)
ohci_init_driver(&ohci_omap3_hc_driver, NULL);
return platform_driver_register(&ohci_hcd_omap3_driver);
}
-module_init(ohci_omap3_init);
+well_done_module_init(ohci_omap3_init);

static void __exit ohci_omap3_cleanup(void)
{
diff --git a/drivers/usb/phy/phy-generic.c b/drivers/usb/phy/phy-generic.c
index bb39498..6405efe 100644
--- a/drivers/usb/phy/phy-generic.c
+++ b/drivers/usb/phy/phy-generic.c
@@ -304,7 +304,7 @@ static int __init usb_phy_gen_xceiv_init(void)
{
return platform_driver_register(&usb_phy_gen_xceiv_driver);
}
-subsys_initcall(usb_phy_gen_xceiv_init);
+well_done_platform_initcall(subsys, usb_phy_gen_xceiv_init);

static void __exit usb_phy_gen_xceiv_exit(void)
{
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-12 16:47:58 UTC
Permalink
Use the feature of dependency based initialization order for drivers
if CONFIG_OF_DEPENDENCIES is enabled.

Signed-off-by: Alexander Holler <***@ahsoftware.de>
---
arch/arm/mach-kirkwood/board-dt.c | 8 ++++++++
1 file changed, 8 insertions(+)

diff --git a/arch/arm/mach-kirkwood/board-dt.c b/arch/arm/mach-kirkwood/board-dt.c
index 7818815..5c352b7 100644
--- a/arch/arm/mach-kirkwood/board-dt.c
+++ b/arch/arm/mach-kirkwood/board-dt.c
@@ -17,6 +17,7 @@
#include <linux/of_address.h>
#include <linux/of_net.h>
#include <linux/of_platform.h>
+#include <linux/of_dependencies.h>
#include <linux/dma-mapping.h>
#include <linux/irqchip.h>
#include <linux/kexec.h>
@@ -133,7 +134,14 @@ static void __init kirkwood_dt_init(void)
if (of_machine_is_compatible("marvell,mv88f6281gtw-ge"))
mv88f6281gtw_ge_init();

+#ifdef CONFIG_OF_DEPENDENCIES
+ if (!of_init_build_order(NULL, NULL))
+ of_init_create_devices(NULL, NULL);
+ else
+ of_init_free_order();
+#else
of_platform_populate(NULL, of_default_bus_match_table, NULL, NULL);
+#endif
}

static const char * const kirkwood_dt_board_compat[] = {
--
1.8.3.1
Alexander Holler
2014-05-12 16:47:59 UTC
Permalink
This serves as an example how easy it is to fix an initialization order
if the order depends on the DT. No source code changes will be necessary.

If you look at the dependency graph for the dockstar, you will see that
there is no dependency between ehci and the usb power regulator. This
ends up with the fact that the regulator will be initialized after ehci.

Fix this by adding one dependency to the .dts.

Signed-off-by: Alexander Holler <***@ahsoftware.de>
---
arch/arm/boot/dts/kirkwood-dockstar.dts | 4 ++++
arch/arm/boot/dts/kirkwood.dtsi | 2 +-
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/kirkwood-dockstar.dts b/arch/arm/boot/dts/kirkwood-dockstar.dts
index f31312e..0ca7456 100644
--- a/arch/arm/boot/dts/kirkwood-dockstar.dts
+++ b/arch/arm/boot/dts/kirkwood-dockstar.dts
@@ -106,3 +106,7 @@
phy-handle = <&ethphy0>;
};
};
+
+&ehci {
+ dependencies = <&usb_power>;
+};
diff --git a/arch/arm/boot/dts/kirkwood.dtsi b/arch/arm/boot/dts/kirkwood.dtsi
index 6abf44d..54a1223 100644
--- a/arch/arm/boot/dts/kirkwood.dtsi
+++ b/arch/arm/boot/dts/kirkwood.dtsi
@@ -185,7 +185,7 @@
status = "okay";
};

- ***@50000 {
+ ehci: ***@50000 {
compatible = "marvell,orion-ehci";
reg = <0x50000 0x1000>;
interrupts = <19>;
--
1.8.3.1
Alexander Holler
2014-05-12 16:48:00 UTC
Permalink
Use the feature of dependency based initialization order for drivers
if CONFIG_OF_DEPENDENCIES is enabled.

This patch also includes a change for a driver which does call
of_platform_populate() itself. This shouldn't be done or necessary
when CONFIG_OF_DEPENDENCIES is enabled.

Signed-off-by: Alexander Holler <***@ahsoftware.de>
---
arch/arm/mach-omap2/pdata-quirks.c | 8 ++++++++
drivers/pwm/pwm-tipwmss.c | 5 +++++
2 files changed, 13 insertions(+)

diff --git a/arch/arm/mach-omap2/pdata-quirks.c b/arch/arm/mach-omap2/pdata-quirks.c
index c33e07e..80becdb 100644
--- a/arch/arm/mach-omap2/pdata-quirks.c
+++ b/arch/arm/mach-omap2/pdata-quirks.c
@@ -13,6 +13,7 @@
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/of_platform.h>
+#include <linux/of_dependencies.h>
#include <linux/wl12xx.h>

#include <linux/platform_data/pinctrl-single.h>
@@ -312,7 +313,14 @@ void __init pdata_quirks_init(struct of_device_id *omap_dt_match_table)
{
omap_sdrc_init(NULL, NULL);
pdata_quirks_check(auxdata_quirks);
+#ifdef CONFIG_OF_DEPENDENCIES
+ if (!of_init_build_order(NULL, NULL))
+ of_init_create_devices(NULL, omap_auxdata_lookup);
+ else
+ of_init_free_order();
+#else
of_platform_populate(NULL, omap_dt_match_table,
omap_auxdata_lookup, NULL);
+#endif
pdata_quirks_check(pdata_quirks);
}
diff --git a/drivers/pwm/pwm-tipwmss.c b/drivers/pwm/pwm-tipwmss.c
index 3b119bc..786fa39 100644
--- a/drivers/pwm/pwm-tipwmss.c
+++ b/drivers/pwm/pwm-tipwmss.c
@@ -78,10 +78,15 @@ static int pwmss_probe(struct platform_device *pdev)
pm_runtime_get_sync(&pdev->dev);
platform_set_drvdata(pdev, info);

+#ifndef CONFIG_OF_DEPENDENCIES
/* Populate all the child nodes here... */
ret = of_platform_populate(node, NULL, NULL, &pdev->dev);
if (ret)
dev_err(&pdev->dev, "no child node found\n");
+#else
+ /* dependency based initialization already has setup devices */
+ ret = 0;
+#endif

return ret;
}
--
1.8.3.1
Alexander Holler
2014-05-13 15:40:34 UTC
Permalink
The subject says all. Patch 5/9 has a bug which avoids registering late drivers
if OF_DEPENDENCIES is disabled.

This also explains the large differences in boot times I've experienced when
comparing boot times with and without DT dependency based initialization order.

Signed-off-by: Alexander Holler <holler-SXC+***@public.gmane.org>

---
init/main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/init/main.c b/init/main.c
index 7591cd1..e16e2b4 100644
--- a/init/main.c
+++ b/init/main.c
@@ -799,9 +799,9 @@ static void __init do_basic_setup(void)
of_init_set_recording(false);
/* probe available platform drivers with deterministic order */
of_init_register_drivers();
+#endif
/* register late drivers */
do_initcall_level(ARRAY_SIZE(initcall_levels) - 3);
-#endif
random_int_secret_init();
}
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-13 19:27:17 UTC
Permalink
In some rare cases it might make sense for not wanting an automatic
dependency for some used phandles.

E.g. if a cpu depends on a regulator which depends on i2c which depends
on some bus and you want that the bus depends on the cpu. That would
end up in a cycle. Usually such doesn't make sense because the hw doesn't
support such circular dependencies too (you always have to start somewhere
with initializing things). But e.g. in the case of the regulator the cpu
depends on, it's very likely that the regulator was initialized
automatically (otherwise the cpu won't run), so there is no real need to
initialize the regulator before the cpu. But that's just an example for
one of such rare cases where it might make sense to avoid an otherwise
automatically added dependency.

The syntax is like

bar: whatever {
...
};
foo {
my-option = <&bar 1 2 3>;
no-dependencies = <&bar>;
};

Without that 'no-dependencies' property dtc would automatically add a
dependency to bar to the property 'dependencies' of the node foo.
But with that 'no-dependencies' it will not automatically add the
listed dependencies.

The property 'no-dependencies' is virtual property and will not be added
to any output file.

Signed-off-by: Alexander Holler <holler-SXC+***@public.gmane.org>
---
scripts/dtc/dependencies.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/scripts/dtc/dependencies.c b/scripts/dtc/dependencies.c
index 4579f6f..06f447b 100644
--- a/scripts/dtc/dependencies.c
+++ b/scripts/dtc/dependencies.c
@@ -73,6 +73,16 @@ static void add_deps(struct node *dt, struct node *node, struct property *prop)
is_parent_of(target, source))
continue;
phandle = get_node_phandle(dt, target);
+ prop_deps = get_property(node, "no-dependencies");
+ if (prop_deps) {
+ /* avoid adding non-dependencies */
+ cell = (cell_t *)prop_deps->val.val;
+ for (i = 0; i < prop_deps->val.len/4; ++i)
+ if (*cell++ == cpu_to_fdt32(phandle))
+ break;
+ if (i < prop_deps->val.len/4)
+ continue;
+ }
prop_deps = get_property(source, "dependencies");
if (!prop_deps) {
add_property(source,
@@ -102,9 +112,21 @@ static void process_nodes_props(struct node *dt, struct node *node)
process_nodes_props(dt, child);
}

+static void del_prop_no_dependencies(struct node *node)
+{
+ struct node *child;
+
+ if (!node)
+ return;
+ delete_property_by_name(node, "no-dependencies");
+ for_each_child(node, child)
+ del_prop_no_dependencies(child);
+}
+
void add_dependencies(struct boot_info *bi)
{
process_nodes_props(bi->dt, bi->dt);
+ del_prop_no_dependencies(bi->dt);
}

/*
--
1.8.3.1

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 08:20:43 UTC
Permalink
Hello,

to make it a bit more easier to evaluate or debug this new feature, here
are some tips:

- To see the initialization order you can use dtc:

CROSS_COMPILE=arm-linux-gnu- ARCH=arm make foo.dtb
scripts/dtc/dtc -I dtb -t arch/arm/boot/dts/foo.dtb

- To see that order in dmesg, I've left a commented function
in patch 2/9. Just enable of_init_print_order() in
drivers/of/of_dependencies.c

- To see which drivers do call of_platform_populate() theirself
(which is not necessary when using this new feature) uncomment the
WARN_ON(np->dev_created); in drivers/of/platform.c (patch 2/9).

- To see which drivers are already "well done" or not, add a small
debug line to of_init_register_platform_driver() in
drivers/of/of_dependencies.c:

@@ -416,39 +416,41 @@ int of_init_register_platform_driver(struct
platform_driver *drv)
{
BUG_ON(!is_recording);
order.platform_drivers[order.count_drivers++] = drv;
+ pr_info("DEPS: recording of drv %s\n", drv->driver.name);
return 0;
}

Now "well done" drivers linked to the kernel will be seen in dmesg as
beeing recorded. This also shows most drivers which are not "well
done", they will be started before the recording of drivers (except
late ones).

- To see when "well done" drivers will be registered (in order) add
something like that to drivers/of/of_dependencies.c:
@@ -445,8 +445,10 @@ void __init of_init_register_drivers(void)
if (of_driver_match_device(dev, &drv->driver)) {
if (!driver_find(drv->driver.name,
- drv->driver.bus))
+ drv->driver.bus)) {
platform_driver_register(drv);
+ pr_info("DEPS: driver %s
registered\n", drv->name);
+ }
if (dev->parent)
device_lock(dev->parent);
rc = device_attach(dev);


Regards,

Alexander Holler
Grant Likely
2014-05-14 14:19:14 UTC
Permalink
Post by Alexander Holler
Hello,
if I would have to describe the Linux kernels init system (before userspace
Unknown functions with almost unknown functionality are called in an almost
random order.
"We should aim to make things synchronous, deterministic and
stuff-happens-in-the-correct-order."
Looks like either the target moved or no Wilhelm Tell was around. ;)
This is an attempt to reach the target for the case of (platform-)drivers.
It is a mere starting point to reach the final target but it works on two
DT enabled ARM devices I have and it hasn't any implications on other
architectures, platforms or whatever. If the new configuration option,
which is only available if DT is enabled, isn't turned on, there is no
increase of code size or similiar.
So what are these patches I'm posting here?
They offer an imho solid base to fix the 3. problem. they build a deterministic
order in which (platform-)drivers should be initialized, based on datas
(dependencies) found in the device tree. They also offer a starting point to fix
the other 2 problems (unknown functions and unknown functionality) by showing a
way how the long range target of known functions with known functionality could
be reached.
Unfortunately work still isn't done. As written above, this is just a starting
point, neiter complete nor perfect. It is what I could do behind closed doors,
by spending a limited amount of time and resources (I've started to look at
that stuff 3-4 weeks ago, sometimes after 3.14 appeared), so it can be blamed
quick & dirty. But it should be already enough to explain and test the concepts.
Enough forewords.
This is a small patch series to use a deterministic dependency based device
and driver initialization order on machines which are using device tree.
The dependency graph will not only be build based on the device tree itself,
but will use dependencies based on phandle references in the .dts,
automatically added by dtc through a new property.
Manualy adding dependencies to the .dts is possible too.
- Correct order of initialization without any "dirty tricks" in drivers or the
machine init code. The order in which devices/drivers will be initialized
depends only on the DT and the topological sort algorithm used by the
kernel, not some code elsewhere. That means less code and more homogeneity
across different SOCs.
- Might be(come) a little faster because the number of deferred probes should
be minimized (they might not even be necessary anymore at all).
- Using a modified algorithm, it's possible to build a list of drivers which
can be initialized in parallel, e.g. using n threads, where n equals the
number of cores. I have not tested nor implemented it, because I don't have
any multicore DT based board with which I easily can use a (patched) mainline
kernel, just locked down multicore things I don't want to go through the
pain of unlocking them.
- Funny dependency graphs when using Graphviz.
- To use this feature correctly, binary blobs must be regenerated by
recompiling them to include properties with dependencies. Without
recompiling them there will be no advantage.
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).

g.
Post by Alexander Holler
- Binary blobs will be slightly larger (some kb, see numbers below).
- Many .dts might need manual work to add some dependencies. Also most
dependencies will be discovered automatically (based on phandle references
in the .dts, some devices might need additional dependencies.
Works out of the box.
Size of dtb without dependencies: 9166
Size of dtb with dependencies: 9579
Graph with 34 nodes, 56 edges
1.325474 1.325458 1.325449 1.325494
4.509989 4.484608 4.316221 4.485310
The large difference in time isn't because of the depency based
init but because ehci detected the connected hd before the panic
occured when deps were enabled. Withoout deps, the panic
already happend without any discovered usb-storage. I haven't
checked why.
The actual times to boot from USB-HD are 3.417248 without
dependencies versus 5.618293 with. I still have to check where
the difference of more than a second does come from, a difference
like on the BBB (see below) should be more reasonable.
Had to disable ethernet (driver crashes).
Size of dtb without dependencies: 31379
Size of dtb with dependencies: 33300
Graph with 145 nodes, 266 edges
1.229431 1.229516 1.229509 1.229557
1.361780 1.361442 1.361532 1.361699
Had to disable usb (driver crashes) and several other problems,
but an unpatched 3.14.y doesn't really work too (which was the
reason I looked at what happes and did these patches).
Size of dtb without dependencies: 57003
Size of dtb with dependencies: 62580
Graph with 390 nodes, 848 edges
3.386535 3.343566 3.381469 3.357208
5.961425 5.907714 6.053680 5.957855
The difference in boot time is mainly based on the function which
matches drivers to devices based on the compatible string. This
is currently not solved very elegant and walks through multiple
list multiple times. The actual sorting is very fast (just some ms).
For people which do like pictures, I've put the dependency graph for
the Dockstar online: http://ahsoftware.de/dt-kirkwood-dockstar.svg
http://ahsoftware.de/dt-omap3-beagle.svg
These pictures makes it easy to see what this feature is about. All the cyan
arrows are the new dependencies, the black ones are the dependecies as
currently used (for device but not driver initialization). So you see, there
is quiet a difference.
If I'm right, those pictures also visualize which drivers could be initialized
in parallel (I haven't checked it, I would have to read the Graphviz
documentation or check the whole graph to be sure). But on a first look at it,
it looks like dot (Graphviz) placed all the nodes which can be initialized in
parallel on the same row. So there are quiet some drivers which could be
initialized in parallel, taking advantage of multiple cores to reduce boot time.
(Just in case you wonder what the first number in the nodes in the pictures
might be, it is the phandle of the device tree node and is usually generated
automatically by dtc and might change when the .dts changes. This number
doesn't have any meaning in regard to the initialization order.)
What follows are the patches I've done, ready to test this feature. These are
- 3 patches for the in-kernel dtc (to add dependencies to the dtb),
- 1 patch for the kernel to use these dependencies to build the initialization
order and create devices,
- 1 patch to register drivers based on the built order,
- 1 patch to show which changes are necessary for some drivers,
- 2 patches to use this feature on kirkwood (very good as small example),
- 1 patch to use this feature on omap2+ (and am33xx),
These patches are based on 3.14.3 and unfortunately don't apply cleanly to
3.15-rcN. But I'm currently too lazy to rebase them as I usually use a stable
kernel to test things I change. And as this is just a RFC, please use 3.14.x
to test these patches.
All patches do explain further what and how they do. And if I get the OK
that they will be merged (after any necessary clean up), I would write
a document for Documentation too (if really wanted, I assume you already have
noticed that I'm not a native english speaker/writer).
- Merge the first 5 patches (maybe after they got cleaned up). So they won't
disappear and people will find a starting point in the kernel to continue
work on. They don't do any harm and don't increase codesize if the new
kernel option they introduce is disabled. It also might be a good idea to
merge them in order to get the new dependencies into binary DT blobs
as early as possible, even if it might still need some time until they
really will be used.
- Have a look at the other patches. Especially the one for the Kirkwood which
changes the initializazion order by just adding one dependency (ehci vs.
regulator) to the .dts. This shows how such could be done without any changes
on the drivers source code.
If you ask why I did those patches: For the same reason a mountain climber
does climb a mountain. That also explains my limited motivation and
resources. ;)
Regards,
Alexander Holler
Please keep away with comments about style, typos or spelling errors in
comments and similiar silly stuff if you have nothing else to say.
Feel free to comment functionality or real errors, but not style, form or
other bureaucrazy things.
And please keep in mind that I'm neiter your intern, your student, your pupil,
nor your child.
_______________________________________________
linux-arm-kernel mailing list
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 15:02:24 UTC
Permalink
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no
end and nothing will happen at all because nobody aggrees how to start.

Regards,

Alexander Holler
Grant Likely
2014-05-14 16:05:26 UTC
Permalink
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.

I'm not saying flat out 'no' here, but before I merge anything, I have
to be reasonably certain that the feature is not going to represent a
maintenance nightmare over the long term.

g.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 16:23:13 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.
The answer is just that they don't have to use this feature.

It is more meant as a long-term solution to fix for the problem of
increasing hard-coded workarounds which all are trying to fix the
initialization order of drivers. Hardware has become a lot more
complicated than it was in the good old days, and I think the time is
right trying to adopt the init-system to this new century instead of
still adding workarounds here and there.
Post by Grant Likely
I'm not saying flat out 'no' here, but before I merge anything, I have
to be reasonably certain that the feature is not going to represent a
maintenance nightmare over the long term.
The maintenance nightmare is already present in form of all the
workarounds which are trying to fix the initialzation order necessary
for modern hardware.

Regards,

Alexander

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Rob Herring
2014-05-14 17:30:07 UTC
Permalink
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.
The answer is just that they don't have to use this feature.
It's not just about users, but maintainers have to carry the code and
anything tied to DT is difficult to change or remove.

Lots of inter-dependencies are already described in DT. We should
leverage those first and then look at how to add dependencies that are
not described.
Post by Alexander Holler
It is more meant as a long-term solution to fix for the problem of
increasing hard-coded workarounds which all are trying to fix the
initialization order of drivers. Hardware has become a lot more complicated
than it was in the good old days, and I think the time is right trying to
adopt the init-system to this new century instead of still adding
workarounds here and there.
I don't know when the good old days were, but this has been a problem
in embedded systems for as long as I have worked on Linux.
Post by Alexander Holler
Post by Grant Likely
I'm not saying flat out 'no' here, but before I merge anything, I have
to be reasonably certain that the feature is not going to represent a
maintenance nightmare over the long term.
The maintenance nightmare is already present in form of all the workarounds
which are trying to fix the initialzation order necessary for modern
hardware.
Do you have concrete examples or cases where deferred probe does not work?

Rob
Alexander Holler
2014-05-14 17:45:54 UTC
Permalink
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.
The answer is just that they don't have to use this feature.
It's not just about users, but maintainers have to carry the code and
anything tied to DT is difficult to change or remove.
Lots of inter-dependencies are already described in DT. We should
leverage those first and then look at how to add dependencies that are
not described.
Again, that's what this feature is about. One of the problems it solves
is that those dependencies which are described in the DT source in form
of phandle reference, do disappear in the blobs because the init-system
would have to know all bindings in order to identify phandle references
(the dependencies) again.
Post by Rob Herring
Post by Alexander Holler
It is more meant as a long-term solution to fix for the problem of
increasing hard-coded workarounds which all are trying to fix the
initialization order of drivers. Hardware has become a lot more complicated
than it was in the good old days, and I think the time is right trying to
adopt the init-system to this new century instead of still adding
workarounds here and there.
I don't know when the good old days were, but this has been a problem
in embedded systems for as long as I have worked on Linux.
Yes, but stuff wasn't as complicated as today, which means it was
relatively easy to manualy solve dependency problems. But if you look at
complicated SOCs like the OMAP, it's much better to let the machine
solve the dependencies to get the initialization order instead of still
trying to do this manually.
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
I'm not saying flat out 'no' here, but before I merge anything, I have
to be reasonably certain that the feature is not going to represent a
maintenance nightmare over the long term.
The maintenance nightmare is already present in form of all the workarounds
which are trying to fix the initialzation order necessary for modern
hardware.
Do you have concrete examples or cases where deferred probe does not work?
Why do people come back to the deferred probe stuff?

One of the biggest problem of the deferred probe stuff is the problem
how to identify real problems if everything ends up with a deferred
probe when an error occurs? That means if you display an error whenever
something is deferred, the log becomes almost unreadable. If you don't
display an error, you never will see an error. And how do you display
the real error when deferred probes finally do fail? The deferred probe
stuff doesn't has any information about the underlying error, so it
can't display it.

Anyway, this feature is totally independ of the deferred probe stuff and
both can friendly live together.

Regards,

Alexander Holler
Alexander Holler
2014-05-14 17:53:58 UTC
Permalink
Post by Alexander Holler
One of the biggest problem of the deferred probe stuff is the problem
how to identify real problems if everything ends up with a deferred
probe when an error occurs? That means if you display an error whenever
something is deferred, the log becomes almost unreadable. If you don't
display an error, you never will see an error. And how do you display
the real error when deferred probes finally do fail? The deferred probe
stuff doesn't has any information about the underlying error, so it
can't display it.
And that is a real problem. I've recently tried to identify why a driver
failed and it was a nightmare because nothing offered any message (debug
or not) when a probe was deferred. So I had to insert tons of printks to
walk upwards to find the finally place where the probe failed.
Everything afterwards just has forwarded the -EPROBE_DEFER without
printing any message.

Regards,

Alexander Holler
Alexander Holler
2014-05-14 18:16:53 UTC
Permalink
Post by Alexander Holler
Post by Alexander Holler
One of the biggest problem of the deferred probe stuff is the problem
how to identify real problems if everything ends up with a deferred
probe when an error occurs? That means if you display an error whenever
something is deferred, the log becomes almost unreadable. If you don't
display an error, you never will see an error. And how do you display
the real error when deferred probes finally do fail? The deferred probe
stuff doesn't has any information about the underlying error, so it
can't display it.
And that is a real problem. I've recently tried to identify why a driver
failed and it was a nightmare because nothing offered any message (debug
or not) when a probe was deferred. So I had to insert tons of printks to
walk upwards to find the finally place where the probe failed.
Everything afterwards just has forwarded the -EPROBE_DEFER without
printing any message.
To add some numbers, I had to insert around 20-30 printks in around 10
or more files to find the underlying problem. Having to do such whenever
an error happens because everything assumes the error will disappear in
a later try, which doesn't happen for real errors, is just a nightmare.

Regards,

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 19:13:50 UTC
Permalink
Post by Alexander Holler
Post by Alexander Holler
Post by Alexander Holler
One of the biggest problem of the deferred probe stuff is the problem
how to identify real problems if everything ends up with a deferred
probe when an error occurs? That means if you display an error whenever
something is deferred, the log becomes almost unreadable. If you don't
display an error, you never will see an error. And how do you display
the real error when deferred probes finally do fail? The deferred probe
stuff doesn't has any information about the underlying error, so it
can't display it.
And that is a real problem. I've recently tried to identify why a driver
failed and it was a nightmare because nothing offered any message (debug
or not) when a probe was deferred. So I had to insert tons of printks to
walk upwards to find the finally place where the probe failed.
Everything afterwards just has forwarded the -EPROBE_DEFER without
printing any message.
To add some numbers, I had to insert around 20-30 printks in around 10
or more files to find the underlying problem. Having to do such whenever
an error happens because everything assumes the error will disappear in
a later try, which doesn't happen for real errors, is just a nightmare.
And to give other people an idea how such a nightmare which has become
reality does look like:

You see a driver fails (through the deferred stuff). You look at that
the driver and see around 5-10 places which return or forward an
-EPROBE_DEFER. You add printks (to all or just some of them, hopefully
the right ones, but Murphy ...). Then you go to the underlying
functions. You see again several places which do the same, you add again
printks. You go to the underlying functions ...

Finally you've created a tree full with nodes of printks, searching for
the one leaf which is the origin of the -EPROBE_DEFER for your problem.

Regards,

Alexander Holler
Rob Herring
2014-05-14 19:06:31 UTC
Permalink
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.
The answer is just that they don't have to use this feature.
It's not just about users, but maintainers have to carry the code and
anything tied to DT is difficult to change or remove.
Lots of inter-dependencies are already described in DT. We should
leverage those first and then look at how to add dependencies that are
not described.
Again, that's what this feature is about. One of the problems it solves is
that those dependencies which are described in the DT source in form of
phandle reference, do disappear in the blobs because the init-system would
have to know all bindings in order to identify phandle references (the
dependencies) again.
They don't disappear, but they become binding specific to recover.
What you are loosing is type information which is something we would
like to solve as well.

You can regenerate or figure out the dependencies with knowledge of
the binding. The of_irq_init code does this. Maintaining this
information in the dtb that can be parsed in a generic way and having
the kernel handle non-bus oriented dependencies are really 2 separate
problems. Let's not try to solve it all at once.
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
I'm not saying flat out 'no' here, but before I merge anything, I have
to be reasonably certain that the feature is not going to represent a
maintenance nightmare over the long term.
The maintenance nightmare is already present in form of all the workarounds
which are trying to fix the initialzation order necessary for modern
hardware.
Do you have concrete examples or cases where deferred probe does not work?
Why do people come back to the deferred probe stuff?
Because it is there today and generally works.
One of the biggest problem of the deferred probe stuff is the problem how to
identify real problems if everything ends up with a deferred probe when an
error occurs? That means if you display an error whenever something is
deferred, the log becomes almost unreadable. If you don't display an error,
you never will see an error. And how do you display the real error when
deferred probes finally do fail? The deferred probe stuff doesn't has any
information about the underlying error, so it can't display it.
This all sounds like "I don't like deferred probe because it is hard
to debug" to me. This all sounds solvable with better instrumentation
and debug capability. Why probe is deferred should be available at the
source when deciding to return -EPROBE_DEFER.

I still have not seen an example of A depends on B, deferred probe
fails because of ? and here is the code for A that works around the
problem.
Anyway, this feature is totally independ of the deferred probe stuff and
both can friendly live together.
Yes, except then we get to maintain both.

Rob
Alexander Holler
2014-05-14 19:24:36 UTC
Permalink
Post by Rob Herring
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.
The answer is just that they don't have to use this feature.
It's not just about users, but maintainers have to carry the code and
anything tied to DT is difficult to change or remove.
Lots of inter-dependencies are already described in DT. We should
leverage those first and then look at how to add dependencies that are
not described.
Again, that's what this feature is about. One of the problems it solves is
that those dependencies which are described in the DT source in form of
phandle reference, do disappear in the blobs because the init-system would
have to know all bindings in order to identify phandle references (the
dependencies) again.
They don't disappear, but they become binding specific to recover.
What you are loosing is type information which is something we would
like to solve as well.
You can regenerate or figure out the dependencies with knowledge of
the binding. The of_irq_init code does this. Maintaining this
information in the dtb that can be parsed in a generic way and having
the kernel handle non-bus oriented dependencies are really 2 separate
problems. Let's not try to solve it all at once.
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
I'm not saying flat out 'no' here, but before I merge anything, I have
to be reasonably certain that the feature is not going to represent a
maintenance nightmare over the long term.
The maintenance nightmare is already present in form of all the workarounds
which are trying to fix the initialzation order necessary for modern
hardware.
Do you have concrete examples or cases where deferred probe does not work?
Why do people come back to the deferred probe stuff?
Because it is there today and generally works.
One of the biggest problem of the deferred probe stuff is the problem how to
identify real problems if everything ends up with a deferred probe when an
error occurs? That means if you display an error whenever something is
deferred, the log becomes almost unreadable. If you don't display an error,
you never will see an error. And how do you display the real error when
deferred probes finally do fail? The deferred probe stuff doesn't has any
information about the underlying error, so it can't display it.
This all sounds like "I don't like deferred probe because it is hard
to debug" to me. This all sounds solvable with better instrumentation
and debug capability. Why probe is deferred should be available at the
source when deciding to return -EPROBE_DEFER.
I still have not seen an example of A depends on B, deferred probe
fails because of ? and here is the code for A that works around the
problem.
Anyway, this feature is totally independ of the deferred probe stuff and
both can friendly live together.
Yes, except then we get to maintain both.
Goodbye and thanks for all the fish.

Sorry, but my patience in dealing with Linux kernel maintainers was
already almost zero before I've posted these patches and I have to
realize that only fools still try to do so.

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-15 01:46:08 UTC
Permalink
Post by Rob Herring
I still have not seen an example of A depends on B, deferred probe
fails because of ? and here is the code for A that works around the
problem.
Post by Alexander Holler
Anyway, this feature is totally independ of the deferred probe stuff and
both can friendly live together.
Yes, except then we get to maintain both.
And just in case someone still hasn't realized what the goal of a
deterministic initialization order is, have a look at this snippet from
arch/arm/mach-omap2/gpio.c:

--------
/*
* gpio_init needs to be done before
* machine_init functions access gpio APIs.
* Hence gpio_init is a omap_postcore_initcall.
*/
static int __init omap2_gpio_init(void)
{
/* If dtb is there, the devices will be created dynamically */
if (of_have_populated_dt())
return -ENODEV;

return omap_hwmod_for_each_by_class("gpio", omap2_gpio_dev_init, NULL);
}
omap_postcore_initcall(omap2_gpio_init);
--------
(Sorry to the OMAP guys, it isn't there fault that it has to look like
this.)

But this is ecactly what should be avoided and why the kernel is in need
of a deterministic, easy to setup, initialization order. And deferred
probes are in now way a help to reach that target, in fact they even
support such stuff. Does anybody outside the OMAP crew do understand
what that piece of code does? The answer is pretty likely no. Again
sorry to the OMAP guys, I'm pretty sure that code was born out of
necessity because no other mechanism is available to get things in order.

Regards,

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-05-14 23:00:33 UTC
Permalink
Post by Rob Herring
Post by Rob Herring
Post by Alexander Holler
Post by Grant Likely
Post by Alexander Holler
Post by Grant Likely
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Then that stuff has to fiddle correct. Sorry, but trying to solve all
problems right from the beginning just leads to endless talks with no end
and nothing will happen at all because nobody aggrees how to start.
I appreciate the problem that you're trying to solve and why you're
using the dtc approach. My job is to poke at the solution and make
sure it is going to be reliable. Making sure all users know how to
fiddle with the new property correctly is not a trivial problem,
especially when it is firmware that will not necessarily be updated.
The answer is just that they don't have to use this feature.
It's not just about users, but maintainers have to carry the code and
anything tied to DT is difficult to change or remove.
Lots of inter-dependencies are already described in DT. We should
leverage those first and then look at how to add dependencies that are
not described.
Again, that's what this feature is about. One of the problems it solves is
that those dependencies which are described in the DT source in form of
phandle reference, do disappear in the blobs because the init-system would
have to know all bindings in order to identify phandle references (the
dependencies) again.
They don't disappear, but they become binding specific to recover.
What you are loosing is type information which is something we would
like to solve as well.
You can regenerate or figure out the dependencies with knowledge of
the binding. The of_irq_init code does this. Maintaining this
information in the dtb that can be parsed in a generic way and having
the kernel handle non-bus oriented dependencies are really 2 separate
problems. Let's not try to solve it all at once.
Btw. I wonder if you have really read what I did and what I have
written. At first the need for knowledge of the binding is broken by
design and will never work for any general solution. Second, I've
already written almost the same as you've written in your first
paragraph in my pragraph you've just quoted directly above. And third,
if you like to solve that problem, I just posted a possible solution. ;)

You only have to take it, feel free to so, all the patches do contain a
Signed-off-by which means that I don't care much what you do with them.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-21 14:02:13 UTC
Permalink
Post by Grant Likely
Post by Alexander Holler
Hello,
if I would have to describe the Linux kernels init system (before userspace
Unknown functions with almost unknown functionality are called in an almost
random order.
"We should aim to make things synchronous, deterministic and
stuff-happens-in-the-correct-order."
Looks like either the target moved or no Wilhelm Tell was around. ;)
This is an attempt to reach the target for the case of (platform-)drivers.
It is a mere starting point to reach the final target but it works on two
DT enabled ARM devices I have and it hasn't any implications on other
architectures, platforms or whatever. If the new configuration option,
which is only available if DT is enabled, isn't turned on, there is no
increase of code size or similiar.
So what are these patches I'm posting here?
They offer an imho solid base to fix the 3. problem. they build a deterministic
order in which (platform-)drivers should be initialized, based on datas
(dependencies) found in the device tree. They also offer a starting point to fix
the other 2 problems (unknown functions and unknown functionality) by showing a
way how the long range target of known functions with known functionality could
be reached.
Unfortunately work still isn't done. As written above, this is just a starting
point, neiter complete nor perfect. It is what I could do behind closed doors,
by spending a limited amount of time and resources (I've started to look at
that stuff 3-4 weeks ago, sometimes after 3.14 appeared), so it can be blamed
quick & dirty. But it should be already enough to explain and test the concepts.
Enough forewords.
This is a small patch series to use a deterministic dependency based device
and driver initialization order on machines which are using device tree.
The dependency graph will not only be build based on the device tree itself,
but will use dependencies based on phandle references in the .dts,
automatically added by dtc through a new property.
Manualy adding dependencies to the .dts is possible too.
- Correct order of initialization without any "dirty tricks" in drivers or the
machine init code. The order in which devices/drivers will be initialized
depends only on the DT and the topological sort algorithm used by the
kernel, not some code elsewhere. That means less code and more homogeneity
across different SOCs.
- Might be(come) a little faster because the number of deferred probes should
be minimized (they might not even be necessary anymore at all).
- Using a modified algorithm, it's possible to build a list of drivers which
can be initialized in parallel, e.g. using n threads, where n equals the
number of cores. I have not tested nor implemented it, because I don't have
any multicore DT based board with which I easily can use a (patched) mainline
kernel, just locked down multicore things I don't want to go through the
pain of unlocking them.
- Funny dependency graphs when using Graphviz.
- To use this feature correctly, binary blobs must be regenerated by
recompiling them to include properties with dependencies. Without
recompiling them there will be no advantage.
Rather than a dtb schema change, for the most common properties (irqs,
clocks, gpios), we could extract dependencies at boot time. I don't like
the idea of adding a separate depends-on property because it is very
easy to get it out of sync with the actual binding data (dtc is not the
only tool that manipulates .dtbs. Firmware will fiddle with it too).
Sorry for reviving this old thread, but this issue keeps popping up time
and time again and people are starting to think about ways to workaround
deferred probe being slow.

I'd very much prefer to avoid that and just keep using deferred probing
as the one means to resolve dependencies at boot time, but at the same
time I can't ignore when several people start complaining about the same
thing.

This very recently popped up again in the context of DRM and panels. One
user complaints that in some cases, deferred probe increases boot time
by 1.5 seconds (best case) and 5-6 seconds (worst case). That doesn't
sound all that much, objectively, but if the boot time of the device is
on the order of 6 seconds or so, then it's quite a lot.

The reason why this happens is that the DRM driver looks up a panel
connected to it and return -EPROBE_DEFER if the panel hasn't registered
yet. Typically the panel would be probed after the DRM driver, therefore
causing the DRM driver to defer probing. But it can also happen that the
panel driver defers probing because it needs a regulator that hasn't
been probed yet.

Anyway, those are all fairly standard reasons for where deferred probe
triggers, and since I do like deferred probe for it's simplicity and
reliability I'd rather not try to work around it if boot time is all
that people are concerned about.

In order to solve the above problem I started investigating a bit what
it would take to parse dependency information from a device tree at boot
time. As it happens this can easily be done for things like clocks and
interrupts properties, because they have very well-defined generic
bindings. Unfortunately they are also the two types of resources where
deferred probing matters the least because they are so special that they
often get probed earlier than normal drivers (and in some cases even way
before initcalls).

Unfortunately for other bindings it's not that easy. Take GPIOs for
example. GPIOs can be specified in fairly arbitrary properties. There's
a concensus on using a -gpio or -gpios suffix, but I don't think there
are any guarantees. It could also technically be that a -gpio or -gpios
property doesn't in fact contain a phandle at all.

The same is true for regulators (*-supply) and likely a couple other
bindings. So one of the hurdles to take trying to obtain the dependency
information from existing DTBs is that there may be false positives (and
the code could therefore try to resolve non-phandles as phandles and
even succeed) and the code resolving the dependencies would need to know
a whole lot about bindings.

I suppose the latter problem could be solved by adding a way for
subsystems to register a "dependency resolver" that would implement the
knowledge about existing generic bindings of the subsystems. Another way
to push that even further down into drivers would be to make drivers
list the types of resources that they need in some structure. Arnd
(Cc'ed) proposed something akin to that a while ago with the goal of
removing a lot of boilerplate from device drivers. Such a list would
automatically "implement" the driver binding and could be used to
resolve dependencies when the driver is registered.

On the other hand, some of the solutions that workaround the problems
caused by deferred probing may actually be worthwhile to have in any
case. One such solution for the above-mentioned example of panels is to
make the DRM subsystem deal better with hotplugging panels. This is in a
way counterintuitive because those panels are not really hotpluggable
(you'd have to disassemble the device at runtime and remove or unsolder
some connector and even then there's no GPIO detecting their presence or
absence). However I can imagine that it would round things off nicely if
a panel driver could simply be unloaded and the result being that the OS
simply registers this as a display going away, much like with external
monitors.

Thierry
Alexander Holler
2014-08-21 19:19:00 UTC
Permalink
Post by Thierry Reding
Anyway, those are all fairly standard reasons for where deferred probe
triggers, and since I do like deferred probe for it's simplicity and
reliability I'd rather not try to work around it if boot time is all
that people are concerned about.
It's neither simple nor reliable. It's non deterministic brutforcing
while making it almost impossible to identify real errors.

In my humble opinion the worst way to solve something. I'm pretty sure
if I would have suggest such a solution, the maintainer crowd would have
eaten me without cooking.

Regards,

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mark Rutland
2014-08-22 13:19:19 UTC
Permalink
Post by Alexander Holler
Post by Thierry Reding
Anyway, those are all fairly standard reasons for where deferred probe
triggers, and since I do like deferred probe for it's simplicity and
reliability I'd rather not try to work around it if boot time is all
that people are concerned about.
It's neither simple nor reliable. It's non deterministic brutforcing
while making it almost impossible to identify real errors.
It's horrible, yes.
Post by Alexander Holler
In my humble opinion the worst way to solve something. I'm pretty sure
if I would have suggest such a solution, the maintainer crowd would have
eaten me without cooking.
We didn't have a better workable solution at the time. Having a hack
that got boards booting was considered better than not having them boot.
I don't remember people being particularly enthralled by the idea.

Thanks,
Mark.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-22 15:45:01 UTC
Permalink
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
Anyway, those are all fairly standard reasons for where deferred probe
triggers, and since I do like deferred probe for it's simplicity and
reliability I'd rather not try to work around it if boot time is all
that people are concerned about.
It's neither simple nor reliable. It's non deterministic brutforcing
while making it almost impossible to identify real errors.
It's horrible, yes.
Post by Alexander Holler
In my humble opinion the worst way to solve something. I'm pretty sure
if I would have suggest such a solution, the maintainer crowd would have
eaten me without cooking.
We didn't have a better workable solution at the time. Having a hack
that got boards booting was considered better than not having them boot.
I don't remember people being particularly enthralled by the idea.
Agreed. And usually I don't flame about workarounds. They are needed
practice usually born out of a time limited background or similiar
constraints.

Only Linux kernel maintainers do demand perfect stuff from others as the
kernel seems to have to be a perfect school project. I for myself
already think checkpatch is a ridiculous tool, only invented to drive
people crazy. Of course, it's better a tool drives people crazy than a
maintainer who make decisions based on the phase of the moon, but ... ;)

And I haven't flamed much about deferred probe before, but if I read
it's simple and reliable I couldn't stand still.

Sorry,

Alexander Holler
Thierry Reding
2014-08-25 09:39:32 UTC
Permalink
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
Anyway, those are all fairly standard reasons for where deferred probe
triggers, and since I do like deferred probe for it's simplicity and
reliability I'd rather not try to work around it if boot time is all
that people are concerned about.
It's neither simple nor reliable. It's non deterministic brutforcing
while making it almost impossible to identify real errors.
It's horrible, yes.
Post by Alexander Holler
In my humble opinion the worst way to solve something. I'm pretty sure
if I would have suggest such a solution, the maintainer crowd would have
eaten me without cooking.
We didn't have a better workable solution at the time.
You make it sound like we've come up with a better workable solution in
the meantime.
Post by Mark Rutland
Having a hack that got boards booting was considered better than not
having them boot.
I don't remember people being particularly enthralled by the idea.
Odd, I remember things quite differently.

Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?

Thierry
Jon Loeliger
2014-08-25 13:08:59 UTC
Permalink
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Good idea.

The proposal on the table is to allow the probe code
to make a topological sort of the devices based on
dependency information either implied, explicitly stated
or both. That is likely a fundamentally correct approach.

I believe some of the issues that need to be resolved are:

1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.

There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.

HTH,
jdl
Thierry Reding
2014-08-25 13:37:16 UTC
Permalink
Post by Jon Loeliger
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Good idea.
The proposal on the table is to allow the probe code
to make a topological sort of the devices based on
dependency information either implied, explicitly stated
or both. That is likely a fundamentally correct approach.
1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.
There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.
I think Grant already objected to the idea of explicitly adding
dependency information into the device tree sources. Rather, if I
understand correctly, we should be using the information readily
available (phandle references) as much as possible before resorting to
additional properties.

So far we've been operating under the assumption that a dependency is
modeled as a phandle reference and that the dependent would contain the
phandle reference to the dependency. That's for example how clocks and
regulators (to name only a few) work. A simplified example would look
like this:

clock: ***@... {
...
#clock-cells = <1>;
...
};

pmic: ***@... {
regulators {
vdd_foo: ldo0 {
...
};

vdd_bar: ldo1 {
...
};
};

***@... {
vdd-supply = <&vdd_foo>;
clocks = <&clock 0>;
};

***@... {
vdd-supply = <&vdd_bar>;
clocks = <&clock 1>;
};

There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).

For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.

Clocks are usually not a problem with deferred probing since they often
are registered early anyway. Regulators on the other hand seem to be a
fairly common trigger, though, so they seem like a good candidate to
focus on.

Thierry
Jon Loeliger
2014-08-25 14:13:59 UTC
Permalink
Post by Thierry Reding
Post by Jon Loeliger
1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.
There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.
I think Grant already objected to the idea of explicitly adding
dependency information into the device tree sources.
Clearly, the reason to object to the introdcution of such
an artificial dependency implies that it would be trying
to express something that doesn't actually describe an
existing hardware requirement. That is, it wouldn't be
"describing hardware". That's fine.

But the inverse should also be true. That is, we should
ensure that where there *is* a hardware dependency that
is currently not expressed, we should introduce that
relationship.
Post by Thierry Reding
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
So, express the "additional SW dependencies" in the SW?
Post by Thierry Reding
Clocks are usually not a problem with deferred probing since they often
are registered early anyway.
Ah, but how are they known to be needed early? A toposort should sort
them into that position. That's not currently done. And I doubt the
set of nodes and expressed dependencies would cause them to be done
early enough by today's standards.
Post by Thierry Reding
Regulators on the other hand seem to be a fairly common trigger,
though, so they seem like a good candidate to focus on.
Yeah. And I've seen some debatable IRQ-PHY-PCIe interaction too.
Post by Thierry Reding
Thierry
jdl
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-25 14:41:28 UTC
Permalink
Post by Jon Loeliger
Post by Thierry Reding
Post by Jon Loeliger
1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.
There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.
I think Grant already objected to the idea of explicitly adding
dependency information into the device tree sources.
Clearly, the reason to object to the introdcution of such
an artificial dependency implies that it would be trying
to express something that doesn't actually describe an
existing hardware requirement. That is, it wouldn't be
"describing hardware". That's fine.
But the inverse should also be true. That is, we should
ensure that where there *is* a hardware dependency that
is currently not expressed, we should introduce that
relationship.
Agreed. Any dependency currently not expressed probably indicates that
the device tree isn't complete yet and is a result of us coming up with
device trees as we go.

Using phandles to describe dependencies makes a lot of sense. As I
understand it the original intent was for OpenFirmware to use phandle to
resolve a "service provider" and call functions that it provided. That's
exactly what we need in Linux, too. Deferred probe is usually triggered
when one device's driver needs to access services provided by a device
that hasn't been registered yet. The way to find such a service provider
is by looking up the phandle (really the struct device_node representing
the referenced node) in a subsystem-specific registry.

Therefore it should be possible to resolve all dependencies at boot time
using nothing but the phandles.
Post by Jon Loeliger
Post by Thierry Reding
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
So, express the "additional SW dependencies" in the SW?
Well, not really. They aren't additional dependencies. The problem is
that if we want to use only phandle references to resolve dependencies
(which is a requirement if we don't want to rely on DT to provide extra
metadata), then we need to know what exactly is a phandle. Since the
final DTB will only have a u32 where a phandle was once referenced, we
need to provide the driver core with some way to recover that lost
information. And the best place to do that really is the driver, because
it implements the binding, hence knows exactly what property and cell in
that property contains a phandle value.

So what we'd be expressing in software is hints as to where to find the
list of dependencies.

In addition to that, a lot of boiler-plate code could be eliminated in
drivers by using that same metadata to automatically request the
resources.
Post by Jon Loeliger
Post by Thierry Reding
Clocks are usually not a problem with deferred probing since they often
are registered early anyway.
Ah, but how are they known to be needed early? A toposort should sort
them into that position. That's not currently done. And I doubt the
set of nodes and expressed dependencies would cause them to be done
early enough by today's standards.
They aren't really regular device drivers but rather registered using an
explicit initcall. Clock providers are even initialized before initcalls
are run. The reason is that they are often required for things like SMP
or by other non-driver code that needs to run very early.
Post by Jon Loeliger
Post by Thierry Reding
Regulators on the other hand seem to be a fairly common trigger,
though, so they seem like a good candidate to focus on.
Yeah. And I've seen some debatable IRQ-PHY-PCIe interaction too.
There are probably a couple of these that can be easily identified and
would eliminate a large percentage of deferred probe triggers already.
I found a link to Arnd's original proposal[0] for the devm_probe()
infrastructure and I think that could serve as a useful basis for this.

I would imagine that embedding a pointer to the devm_probe structure
into a struct device_driver and extending it with a way to resolve the
dependencies could work well.

Thierry

[0]: http://lists.infradead.org/pipermail/linux-arm-kernel/2013-November/209031.html
Grant Likely
2014-08-26 08:42:08 UTC
Permalink
Post by Thierry Reding
Post by Jon Loeliger
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Good idea.
The proposal on the table is to allow the probe code
to make a topological sort of the devices based on
dependency information either implied, explicitly stated
or both. That is likely a fundamentally correct approach.
1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.
There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.
I think Grant already objected to the idea of explicitly adding
dependency information into the device tree sources. Rather, if I
understand correctly, we should be using the information readily
available (phandle references) as much as possible before resorting to
additional properties.
My objection is primarily around the concern that the dependency data
will get stale if firmware modifies the tree or that the kernel will
break when by relying on incorrect dependency information.

If the kernel can handle incorrect dependencies gracefully (like falling
back to deferred probe) then we can consider additional properties.
Post by Thierry Reding
So far we've been operating under the assumption that a dependency is
modeled as a phandle reference and that the dependent would contain the
phandle reference to the dependency. That's for example how clocks and
regulators (to name only a few) work. A simplified example would look
...
#clock-cells = <1>;
...
};
regulators {
vdd_foo: ldo0 {
...
};
vdd_bar: ldo1 {
...
};
};
vdd-supply = <&vdd_foo>;
clocks = <&clock 0>;
};
vdd-supply = <&vdd_bar>;
clocks = <&clock 1>;
};
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.

g.

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-26 08:49:24 UTC
Permalink
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.

Thierry
Alexander Holler
2014-08-26 09:42:04 UTC
Permalink
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add
those "dependencies" to identify phandles), or you need to know every
binding (at "dependency-resolve-time" to identify phandles. The latter
is impracticable to implement in a generic way (for use with every
possible binding).

Regards,

Alexander Holler
Mark Rutland
2014-08-26 10:11:07 UTC
Permalink
Post by Alexander Holler
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add
those "dependencies" to identify phandles), or you need to know every
binding (at "dependency-resolve-time" to identify phandles.
While having type information in the DTB would be fantastic, it's not
something we can expect from the systems already in the wild, and I
worry how it would interact with bootloaders that modify the DTB (I
don't know if any modify properties with phandles).
Post by Alexander Holler
The latter is impracticable to implement in a generic way (for use
with every possible binding).
I don't think we necessarily need dependency information for every
binding and driver. We only need dependency information where a device
has a dependency on another device and we don't currently have an
explicit probe ordering guaranteed by Linux.

Where a device driver lacks dependency information and fails to probe,
we can fall back to the current deferred probing.

Do we have any worst case example systems / drivers / dts?

Thanks,
Mark.
Thierry Reding
2014-08-26 10:24:05 UTC
Permalink
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add
those "dependencies" to identify phandles), or you need to know every
binding (at "dependency-resolve-time" to identify phandles.
While having type information in the DTB would be fantastic, it's not
something we can expect from the systems already in the wild, and I
worry how it would interact with bootloaders that modify the DTB (I
don't know if any modify properties with phandles).
Post by Alexander Holler
The latter is impracticable to implement in a generic way (for use
with every possible binding).
I don't think we necessarily need dependency information for every
binding and driver. We only need dependency information where a device
has a dependency on another device and we don't currently have an
explicit probe ordering guaranteed by Linux.
Where a device driver lacks dependency information and fails to probe,
we can fall back to the current deferred probing.
Do we have any worst case example systems / drivers / dts?
Cc'ing Stéphane who's brought this up not long ago. There seem to be
cases where display initialization can be delayed up to 5-6 seconds due
to deferred probing (where the system would otherwise take 5-6 seconds
to boot).

Thierry
Grant Likely
2014-08-27 10:34:32 UTC
Permalink
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add
those "dependencies" to identify phandles), or you need to know every
binding (at "dependency-resolve-time" to identify phandles.
While having type information in the DTB would be fantastic, it's not
something we can expect from the systems already in the wild, and I
worry how it would interact with bootloaders that modify the DTB (I
don't know if any modify properties with phandles).
Anything we do here is firmly in the realm of optimization and
improvement. Adding data to the tree is fine as long as we don't make
the kernel depend on it. Older platforms will continue to work without
the optimization.
Post by Mark Rutland
Post by Alexander Holler
The latter is impracticable to implement in a generic way (for use
with every possible binding).
I don't think we necessarily need dependency information for every
binding and driver. We only need dependency information where a device
has a dependency on another device and we don't currently have an
explicit probe ordering guaranteed by Linux.
Where a device driver lacks dependency information and fails to probe,
we can fall back to the current deferred probing.
Exactly.

g.
Catalin Marinas
2014-08-27 14:44:03 UTC
Permalink
Post by Grant Likely
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add
those "dependencies" to identify phandles), or you need to know every
binding (at "dependency-resolve-time" to identify phandles.
While having type information in the DTB would be fantastic, it's not
something we can expect from the systems already in the wild, and I
worry how it would interact with bootloaders that modify the DTB (I
don't know if any modify properties with phandles).
Anything we do here is firmly in the realm of optimization and
improvement. Adding data to the tree is fine as long as we don't make
the kernel depend on it. Older platforms will continue to work without
the optimization.
It's not just optimisation but an important feature for new arm64 SoCs.
Given some Tegra discussions recently, in many cases the machine_desc
use on arm is primarily to initialise devices in the right order. If we
can solve this in a more deterministic way (other than deferred
probing), we avoid the need for a dedicated SoC platform driver (or
machine_desc) or workarounds like different initcall levels and explicit
DT parsing.
--
Catalin
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Stephen Warren
2014-08-27 16:22:06 UTC
Permalink
Post by Catalin Marinas
Post by Grant Likely
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add
those "dependencies" to identify phandles), or you need to know every
binding (at "dependency-resolve-time" to identify phandles.
While having type information in the DTB would be fantastic, it's not
something we can expect from the systems already in the wild, and I
worry how it would interact with bootloaders that modify the DTB (I
don't know if any modify properties with phandles).
Anything we do here is firmly in the realm of optimization and
improvement. Adding data to the tree is fine as long as we don't make
the kernel depend on it. Older platforms will continue to work without
the optimization.
It's not just optimisation but an important feature for new arm64 SoCs.
Given some Tegra discussions recently, in many cases the machine_desc
use on arm is primarily to initialise devices in the right order. If we
can solve this in a more deterministic way (other than deferred
probing), we avoid the need for a dedicated SoC platform driver (or
machine_desc) or workarounds like different initcall levels and explicit
DT parsing.
A lot of the ordering is SW driver dependencies. I'm not sure how much
of that can accurately be claimed as HW dependencies. As such, I'm not
sure that putting dependencies into DT would be a good idea; it doesn't
feel like HW data, and might well change if we restructure SW. It'd need
some detailed research though.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-27 16:30:54 UTC
Permalink
Post by Stephen Warren
Post by Catalin Marinas
It's not just optimisation but an important feature for new arm64 SoCs.
Given some Tegra discussions recently, in many cases the machine_desc
use on arm is primarily to initialise devices in the right order. If we
can solve this in a more deterministic way (other than deferred
probing), we avoid the need for a dedicated SoC platform driver (or
machine_desc) or workarounds like different initcall levels and explicit
DT parsing.
A lot of the ordering is SW driver dependencies. I'm not sure how much
of that can accurately be claimed as HW dependencies. As such, I'm not
sure that putting dependencies into DT would be a good idea; it doesn't
feel like HW data, and might well change if we restructure SW. It'd need
some detailed research though.
Almost every phandle is a dependency, so the DT is already full with them.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Stephen Warren
2014-08-27 16:37:58 UTC
Permalink
Post by Alexander Holler
Post by Stephen Warren
Post by Catalin Marinas
It's not just optimisation but an important feature for new arm64 SoCs.
Given some Tegra discussions recently, in many cases the machine_desc
use on arm is primarily to initialise devices in the right order. If we
can solve this in a more deterministic way (other than deferred
probing), we avoid the need for a dedicated SoC platform driver (or
machine_desc) or workarounds like different initcall levels and explicit
DT parsing.
A lot of the ordering is SW driver dependencies. I'm not sure how much
of that can accurately be claimed as HW dependencies. As such, I'm not
sure that putting dependencies into DT would be a good idea; it doesn't
feel like HW data, and might well change if we restructure SW. It'd need
some detailed research though.
Almost every phandle is a dependency, so the DT is already full with them.
That's true, but not entirely relevant.

phandles in DT should only be present where there's an obvious HW
dependency. It's obvious that, for example, there's a real HW dependency
between an IRQ controller and a device that has an IRQ signal fed into
the IRQ controller. It makes perfect sense to represent that as a
phandle (+args).

However, most of the ordering imposed by the Tegra machine descriptor
callbacks is nothing to do with this. It's more that the SW driver for
component X needs some low level data (e.g. SKU/fuse information) before
it can run. However, there's no real HW dependency between the HW
component X and the fuse module. As such, it doesn't make sense to
represent such a dependency in DT, using a phandle or by any other means.

Of course, there are probably cases where we could/should add some more
phandles/... and likewise cases where we shouldn't. That's why detailed
research is needed.

Irrespective though, a new kernel needs to work against an old DT, so
always needs to work without any (of these new) dependencies being
represented in DT, since they aren't represented there today. So, I
think pushing the issue into DT is a non-starter either way, unless we
accept yet another ABI-breaking change, in which case we should just
give up on any claims of ABI and make everyone's lives simpler.
Alexander Holler
2014-08-27 16:58:54 UTC
Permalink
Post by Stephen Warren
Post by Alexander Holler
Post by Stephen Warren
Post by Catalin Marinas
It's not just optimisation but an important feature for new arm64 SoCs.
Given some Tegra discussions recently, in many cases the machine_desc
use on arm is primarily to initialise devices in the right order. If we
can solve this in a more deterministic way (other than deferred
probing), we avoid the need for a dedicated SoC platform driver (or
machine_desc) or workarounds like different initcall levels and explicit
DT parsing.
A lot of the ordering is SW driver dependencies. I'm not sure how much
of that can accurately be claimed as HW dependencies. As such, I'm not
sure that putting dependencies into DT would be a good idea; it doesn't
feel like HW data, and might well change if we restructure SW. It'd need
some detailed research though.
Almost every phandle is a dependency, so the DT is already full with them.
That's true, but not entirely relevant.
phandles in DT should only be present where there's an obvious HW
dependency. It's obvious that, for example, there's a real HW dependency
between an IRQ controller and a device that has an IRQ signal fed into
the IRQ controller. It makes perfect sense to represent that as a
phandle (+args).
However, most of the ordering imposed by the Tegra machine descriptor
callbacks is nothing to do with this. It's more that the SW driver for
component X needs some low level data (e.g. SKU/fuse information) before
it can run. However, there's no real HW dependency between the HW
component X and the fuse module. As such, it doesn't make sense to
represent such a dependency in DT, using a phandle or by any other means.
Of course, there are probably cases where we could/should add some more
phandles/... and likewise cases where we shouldn't. That's why detailed
research is needed.
Irrespective though, a new kernel needs to work against an old DT, so
always needs to work without any (of these new) dependencies being
represented in DT, since they aren't represented there today. So, I
think pushing the issue into DT is a non-starter either way, unless we
accept yet another ABI-breaking change, in which case we should just
give up on any claims of ABI and make everyone's lives simpler.
If I hear research, my response is usually "how many years"?

Fact is that there are already a lot of usable dependencies in the DT,
they just didn't find their way into the kernel and weren't used.

And I think it doesn't help much to make the picture more complicated
than it already is. Solve one step by another and not try to solve
everything at once.

So first enable the kernel to use dependencies at all. I've shown that
it doesn't need magic to do so. Afterwards you can extend or change the
existing solution. It's not always the best approach, but for
complicated things it often doesn't make sense trying to solve
everything at first.

Of course, my approach isn't perfect, but at least it is something
people can already use to play with.

Ok, the way how my patches do handle devices (not drivers) might be
completely wrong, but that's just because I've never got in contact with
the device-model before, it always just worked. So I haven't spend any
time to look into that before and I didn't spend much time to look into
that for my patches (I just discoverd that device-handling by drivers
looks sometimes awkward). I was happy with what I've achieved in the
short time I've spend, and therfor posted the patches to give other
people an easy possibility to try the stuff themself.

Regards,

Alexander Holler
Catalin Marinas
2014-08-27 17:52:43 UTC
Permalink
Post by Stephen Warren
Post by Alexander Holler
Post by Stephen Warren
Post by Catalin Marinas
It's not just optimisation but an important feature for new arm64 SoCs.
Given some Tegra discussions recently, in many cases the machine_desc
use on arm is primarily to initialise devices in the right order. If we
can solve this in a more deterministic way (other than deferred
probing), we avoid the need for a dedicated SoC platform driver (or
machine_desc) or workarounds like different initcall levels and explicit
DT parsing.
A lot of the ordering is SW driver dependencies. I'm not sure how much
of that can accurately be claimed as HW dependencies. As such, I'm not
sure that putting dependencies into DT would be a good idea; it doesn't
feel like HW data, and might well change if we restructure SW. It'd need
some detailed research though.
Almost every phandle is a dependency, so the DT is already full with them.
That's true, but not entirely relevant.
phandles in DT should only be present where there's an obvious HW
dependency. It's obvious that, for example, there's a real HW dependency
between an IRQ controller and a device that has an IRQ signal fed into
the IRQ controller. It makes perfect sense to represent that as a
phandle (+args).
Other examples are power controllers or some MFD device (as we have on
vexpress). For these we normally have phandles.
Post by Stephen Warren
However, most of the ordering imposed by the Tegra machine descriptor
callbacks is nothing to do with this. It's more that the SW driver for
component X needs some low level data (e.g. SKU/fuse information) before
it can run. However, there's no real HW dependency between the HW
component X and the fuse module. As such, it doesn't make sense to
represent such a dependency in DT, using a phandle or by any other means.
But isn't fuse some piece of hardware? We don't have a model for it, so
I guess you just present it as a library that accesses the hardware.
Anyway, in such case something like Pawel's SoC driver proposal would
work. Now if anything inside the SoC bus (I have to re-read, I don't
fully remember the details) is probed after the SoC driver, you could
even initialise your SoC at device_initcall() level.
Post by Stephen Warren
Irrespective though, a new kernel needs to work against an old DT,
I fully agree. But we shouldn't really extend the "old DT" statement to
a new ARMv8 SoC ;).
--
Catalin
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-27 18:14:01 UTC
Permalink
Post by Catalin Marinas
Post by Stephen Warren
Irrespective though, a new kernel needs to work against an old DT,
I fully agree. But we shouldn't really extend the "old DT" statement to
a new ARMv8 SoC ;).
Or any new v7 SoC. And even poor users of current ARM HW do want use
their HW. And they don't care if they have to change the DT if they
finally are able to use their board, which happens seldom enough. (I'm
not speaking about companies which are able to spend many man-years to
fix one kernel version for use with one specific HW).

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-28 06:50:36 UTC
Permalink
Post by Stephen Warren
Of course, there are probably cases where we could/should add some more
phandles/... and likewise cases where we shouldn't. That's why detailed
research is needed.
Just because I'm curious, I wonder how this research does or shoud look
like.

Defered probes did come to light with 3.4, that was more than 2 years
ago. Ok, most people (like me) just noticed it during the last months
when they switched to DT and have run into a problem (the deferred probe
mechanism is an error-message killer), but some must have seen it
already 2 years ago.

And I wonder how the ACPI world solves that problem. My guess would be
hardcoded stuff in the firmware-blob (BIOS), just like it happened with
board files, but I've never seen BIOS code and my knowledge about ACPI
is almost zero. ;)

Regards,

Alexander Holler
Catalin Marinas
2014-08-28 09:23:08 UTC
Permalink
Post by Alexander Holler
Post by Stephen Warren
Of course, there are probably cases where we could/should add some more
phandles/... and likewise cases where we shouldn't. That's why detailed
research is needed.
Just because I'm curious, I wonder how this research does or shoud look
like.
Defered probes did come to light with 3.4, that was more than 2 years
ago. Ok, most people (like me) just noticed it during the last months
when they switched to DT and have run into a problem (the deferred probe
mechanism is an error-message killer), but some must have seen it
already 2 years ago.
And I wonder how the ACPI world solves that problem. My guess would be
hardcoded stuff in the firmware-blob (BIOS), just like it happened with
board files, but I've never seen BIOS code and my knowledge about ACPI
is almost zero. ;)
ACPI doesn't even attempt to solve such problems at the OS level. SoCs
aimed at ACPI should have simple hardware configuration (from a Linux
perspective) initialised by firmware (e.g. clocks) and devices living on
a standard bus like PCIe. If a SoC requires specific low-level code to
initialise the hardware in a specific order (other than architected
peripherals like GIC, timers; e.g. MFD devices), we should deem it
unsuitable for ACPI.

ACPI should only be used by vendors who know exactly why they need and
how to implement it properly and not just because the marketing
department told them to (it would also be nice if the Linux kernel
community was informed about such reasons).
--
Catalin
Alexander Holler
2014-08-29 01:43:26 UTC
Permalink
Post by Catalin Marinas
Post by Alexander Holler
And I wonder how the ACPI world solves that problem. My guess would be
hardcoded stuff in the firmware-blob (BIOS), just like it happened with
board files, but I've never seen BIOS code and my knowledge about ACPI
is almost zero. ;)
ACPI doesn't even attempt to solve such problems at the OS level. SoCs
aimed at ACPI should have simple hardware configuration (from a Linux
perspective) initialised by firmware (e.g. clocks) and devices living on
a standard bus like PCIe. If a SoC requires specific low-level code to
initialise the hardware in a specific order (other than architected
peripherals like GIC, timers; e.g. MFD devices), we should deem it
unsuitable for ACPI.
ACPI should only be used by vendors who know exactly why they need and
how to implement it properly and not just because the marketing
department told them to (it would also be nice if the Linux kernel
community was informed about such reasons).
Hmm, Jon Masters from Red Hat sounds too like UEFI/ACPI is the way to go.

But maybe he's right and hiding ugly code in proprietary/binary firmware
blobs where no one can see and criticize it is really the way to go.

Personally I think it's a way back to the past ("when everything was
better", at least that is what most human brains do like to suggest) and
just a dream. And I don't believe that the stuff which was and is hidden
away always just works and doesn't need fixes (one can't do themself).
In my humble opinion that's a nice but wrong myth and doesn't take into
consideration that HW (and SW) gets more and more complicated too (and
thus more faulty).

And even the biggest companies already have problems to produce stuff
which just works (they just employ humans too), not to speak from
smaller board vendors. So it's much better to keep the source as visible
as possible to as much people as possible (in the OS) and not to rely on
some binary blob as most BIOSes are.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-26 10:25:04 UTC
Permalink
Post by Thierry Reding
[...]
Post by Grant Likely
Post by Thierry Reding
There are somewhat standardized bindings for the above and especially
for bindings of the type that clocks implement this is trivial. We can
simply iterate over each (phandle, specifier) tuple and check that the
corresponding clock provider can be resolved (which typically means that
it's been registered with the common clock framework).
For regulators (and regulator-like bindings) the problem is somewhat
more difficult because they property names are not standardized. One way
to solve this would be to look for property names with a -supply suffix,
but that could obviously lead to false positives. One alternative that I
think could eliminate this would be to explicitly list dependencies in
drivers. This would allow core code to step through such a list and
resolve the (phandle, specifier) tuples.
False positives and negatives may not actually be a problem. It is
suboptimal, certainly, but it shouldn't outright break the kernel.
There could be cases where some random integer in a cell could be
interpreted as a phandle and resolve to a struct device_node. I suppose
it might be unlikely, but not impossible, that the device_node could
even match a device in the correct subsystem and you'd get a wrong
dependency. Granted, a wrong dependency may not be catastrophic in that
it won't lead to a crash, but it could lead to various kinds of
weirdness and hard to diagnose problems.
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.

Thierry
Alexander Holler
2014-08-26 10:44:35 UTC
Permalink
Post by Thierry Reding
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.
That would end up with almost the same ugly driver-based workarounds as
now. It's much better if a driver author only has to define it's
prerequisits (in form of dependencies in the dts) and could be sure the
driver will only be probed if those are met, than to do that stuff based
on a subsystem or even driver level.

If you add dependency-information to drivers, you have two problems:

- How do you get these information from the driver (remember, currently
there is only one initial call, a initcall which might do almost anything)

- These information might become outdated and you would have to change
all drivers. E.g. if the name of a dependency (driver) changes it
wouldn't be done with changing the dts (maybe plural), but you would
have to change the source of all dependant drivers too.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-26 11:01:29 UTC
Permalink
Post by Alexander Holler
Post by Thierry Reding
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.
That would end up with almost the same ugly driver-based workarounds as
now. It's much better if a driver author only has to define it's
prerequisits (in form of dependencies in the dts) and could be sure the
driver will only be probed if those are met, than to do that stuff based
on a subsystem or even driver level.
- How do you get these information from the driver (remember, currently
there is only one initial call, a initcall which might do almost anything)
- These information might become outdated and you would have to change
all drivers. E.g. if the name of a dependency (driver) changes it
wouldn't be done with changing the dts (maybe plural), but you would
have to change the source of all dependant drivers too.
And after having sorted my brain:

A driver depends on a binding (and its API), but not on explicit named
other drivers. So trying it (again) on driver level is doomed to fail.

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-26 11:08:28 UTC
Permalink
Post by Thierry Reding
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.
That would end up with almost the same ugly driver-based workarounds as now.
It's much better if a driver author only has to define it's prerequisits (in
form of dependencies in the dts) and could be sure the driver will only be
probed if those are met, than to do that stuff based on a subsystem or even
driver level.
We already have all that dependency information in drivers anyway. Each
driver requests the resources at .probe() time. What I proposed (it was
really Arnd who proposed it first) is to move that information out of
code and into some sort of table that could be used by the driver core
to figure out dependencies.
- How do you get these information from the driver (remember, currently
there is only one initial call, a initcall which might do almost anything)
While I don't think it's necessary, that's something that could be
changed. I mean, we have access to the full source code of this
operating system, so we can change every aspect of it. If we can't find
a way to make this work with the current initcall sequence it's always
an option to extend that sequence so that it meets our needs.
- These information might become outdated and you would have to change all
drivers. E.g. if the name of a dependency (driver) changes it wouldn't be
done with changing the dts (maybe plural), but you would have to change the
source of all dependant drivers too.
No. Drivers implement a DT binding. That binding defines what power
supplies, clocks, pinmux, ... the device needs. Those constitute the
dependencies. We most certainly don't want to depend on driver names
since there can be a multitude of different drivers that provide a given
dependency.

What drivers should provide (and what they already provide today) is the
name of the property and the index of the cell that they expect to find
a phandle in as well as the type of the phandle. That's all that's
necessary, really. Everything else can be derived from that phandle and
the type.

Thierry
Alexander Holler
2014-08-26 11:23:54 UTC
Permalink
Post by Thierry Reding
Post by Thierry Reding
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.
That would end up with almost the same ugly driver-based workarounds as now.
It's much better if a driver author only has to define it's prerequisits (in
form of dependencies in the dts) and could be sure the driver will only be
probed if those are met, than to do that stuff based on a subsystem or even
driver level.
We already have all that dependency information in drivers anyway. Each
driver requests the resources at .probe() time. What I proposed (it was
really Arnd who proposed it first) is to move that information out of
code and into some sort of table that could be used by the driver core
to figure out dependencies.
- How do you get these information from the driver (remember, currently
there is only one initial call, a initcall which might do almost anything)
While I don't think it's necessary, that's something that could be
changed. I mean, we have access to the full source code of this
operating system, so we can change every aspect of it. If we can't find
a way to make this work with the current initcall sequence it's always
an option to extend that sequence so that it meets our needs.
- These information might become outdated and you would have to change all
drivers. E.g. if the name of a dependency (driver) changes it wouldn't be
done with changing the dts (maybe plural), but you would have to change the
source of all dependant drivers too.
No. Drivers implement a DT binding. That binding defines what power
supplies, clocks, pinmux, ... the device needs. Those constitute the
dependencies. We most certainly don't want to depend on driver names
since there can be a multitude of different drivers that provide a given
dependency.
What drivers should provide (and what they already provide today) is the
name of the property and the index of the cell that they expect to find
a phandle in as well as the type of the phandle. That's all that's
necessary, really. Everything else can be derived from that phandle and
the type.
Drivers don't provide that information (dependencies) in any usable way.
And as you said yourself, it's already contained in phandles. So what we
are discussing here about? The proposal to use phandles for that is
already on the table since several month. ;)

Sorry, but I don't understand what you want to propose.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-26 11:47:19 UTC
Permalink
Post by Thierry Reding
Post by Thierry Reding
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.
That would end up with almost the same ugly driver-based workarounds as now.
It's much better if a driver author only has to define it's prerequisits (in
form of dependencies in the dts) and could be sure the driver will only be
probed if those are met, than to do that stuff based on a subsystem or even
driver level.
We already have all that dependency information in drivers anyway. Each
driver requests the resources at .probe() time. What I proposed (it was
really Arnd who proposed it first) is to move that information out of
code and into some sort of table that could be used by the driver core
to figure out dependencies.
- How do you get these information from the driver (remember, currently
there is only one initial call, a initcall which might do almost anything)
While I don't think it's necessary, that's something that could be
changed. I mean, we have access to the full source code of this
operating system, so we can change every aspect of it. If we can't find
a way to make this work with the current initcall sequence it's always
an option to extend that sequence so that it meets our needs.
- These information might become outdated and you would have to change all
drivers. E.g. if the name of a dependency (driver) changes it wouldn't be
done with changing the dts (maybe plural), but you would have to change the
source of all dependant drivers too.
No. Drivers implement a DT binding. That binding defines what power
supplies, clocks, pinmux, ... the device needs. Those constitute the
dependencies. We most certainly don't want to depend on driver names
since there can be a multitude of different drivers that provide a given
dependency.
What drivers should provide (and what they already provide today) is the
name of the property and the index of the cell that they expect to find
a phandle in as well as the type of the phandle. That's all that's
necessary, really. Everything else can be derived from that phandle and
the type.
Drivers don't provide that information (dependencies) in any usable way. And
as you said yourself, it's already contained in phandles. So what we are
discussing here about? The proposal to use phandles for that is already on
the table since several month. ;)
Sorry, but I don't understand what you want to propose.
In many cases we simply don't know where phandles are stored since we
don't have the type information in DT. But drivers already know the type
of a specific phandle and where to get it from, so the proposal is to
make that knowledge more generally useful so that it can be used for
dependency resolution.

Thierry
Alexander Holler
2014-08-26 12:00:40 UTC
Permalink
Post by Thierry Reding
Post by Thierry Reding
Post by Thierry Reding
You need either the type information in the DTB (that's why I've add those
"dependencies" to identify phandles), or you need to know every binding (at
"dependency-resolve-time" to identify phandles. The latter is impracticable
to implement in a generic way (for use with every possible binding).
Like I already mentioned, this could be done in drivers who contain that
information already anyway. Or parts of it could be done in subsystem-
specific callbacks where a generic binding is available.
That would end up with almost the same ugly driver-based workarounds as now.
It's much better if a driver author only has to define it's prerequisits (in
form of dependencies in the dts) and could be sure the driver will only be
probed if those are met, than to do that stuff based on a subsystem or even
driver level.
We already have all that dependency information in drivers anyway. Each
driver requests the resources at .probe() time. What I proposed (it was
really Arnd who proposed it first) is to move that information out of
code and into some sort of table that could be used by the driver core
to figure out dependencies.
- How do you get these information from the driver (remember, currently
there is only one initial call, a initcall which might do almost anything)
While I don't think it's necessary, that's something that could be
changed. I mean, we have access to the full source code of this
operating system, so we can change every aspect of it. If we can't find
a way to make this work with the current initcall sequence it's always
an option to extend that sequence so that it meets our needs.
- These information might become outdated and you would have to change all
drivers. E.g. if the name of a dependency (driver) changes it wouldn't be
done with changing the dts (maybe plural), but you would have to change the
source of all dependant drivers too.
No. Drivers implement a DT binding. That binding defines what power
supplies, clocks, pinmux, ... the device needs. Those constitute the
dependencies. We most certainly don't want to depend on driver names
since there can be a multitude of different drivers that provide a given
dependency.
What drivers should provide (and what they already provide today) is the
name of the property and the index of the cell that they expect to find
a phandle in as well as the type of the phandle. That's all that's
necessary, really. Everything else can be derived from that phandle and
the type.
Drivers don't provide that information (dependencies) in any usable way. And
as you said yourself, it's already contained in phandles. So what we are
discussing here about? The proposal to use phandles for that is already on
the table since several month. ;)
Sorry, but I don't understand what you want to propose.
In many cases we simply don't know where phandles are stored since we
don't have the type information in DT. But drivers already know the type
of a specific phandle and where to get it from, so the proposal is to
make that knowledge more generally useful so that it can be used for
dependency resolution.
How?

Anyway, I'm leaving this discussion. I've already made a proposal which
solved most mentioned problems (imho) and even offered usable patches
(ok, they suffer under the "not invented here" syndrom, but ...). ;)

But please continue this discussion, I will try to not disturb it anymore.

Regards,

Alexander Holler

--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jon Loeliger
2014-08-26 13:58:34 UTC
Permalink
Post by Alexander Holler
Post by Thierry Reding
Drivers don't provide that information (dependencies) in any usable way. And
as you said yourself, it's already contained in phandles. So what we are
discussing here about? The proposal to use phandles for that is already on
the table since several month. ;)
Sorry, but I don't understand what you want to propose.
In many cases we simply don't know where phandles are stored since we
don't have the type information in DT. But drivers already know the type
of a specific phandle and where to get it from, so the proposal is to
make that knowledge more generally useful so that it can be used for
dependency resolution.
How?
Is the issue around which we are dancing here the timing of
topsort and the probing? When the driver is probed, sure, it
touches and resolves a bunch of phandles and references other
nodes and devices. But that is at probe time, and it only has
the context of itself then.

I think we need to do the complete topsort *before* we attempt
to do any probing. So three steps:

1) Graph Construction
Add a new "emit dependencies" function to driver bindings.
Iterate over known devices or nodes in the DT in any order.
Call the "emit dependencies" function. It adds all
dependency edges to a global graph by knowing what
phandles or other pieces it will need.
A driver with no "emit dependencies" function can be
added to the graph anywhere without loss of generality.
Add any additional edges for whatever reason.

2) Topsort the generated driver graph

3) Call probe for real in topsort order

Alexander, I don't recall the details of your patch series.
Can you please remind us if it took this approach in the kernel?
Post by Alexander Holler
Anyway, I'm leaving this discussion. I've already made a proposal
which solved most mentioned problems (imho) and even offered usable
patches
Darn. I think you clearly have a pony in this race, and it
would be good if you still participated. Really.
Post by Alexander Holler
(ok, they suffer under the "not invented here" syndrom, but ...). ;)
There isn't a single thing in the entire Linux Kernel community
that was "invented here"; every aspect of it was NIH'ed by *someone*.
That's how it gets built, changed, maintained, fixed, etc.
Post by Alexander Holler
But please continue this discussion, I will try to not disturb it anymore.
I'm sorry to hear that.
Post by Alexander Holler
Regards,
Alexander Holler
HTH,
jdl
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Thierry Reding
2014-08-26 14:17:16 UTC
Permalink
Post by Jon Loeliger
Post by Thierry Reding
Drivers don't provide that information (dependencies) in any usable way. And
as you said yourself, it's already contained in phandles. So what we are
discussing here about? The proposal to use phandles for that is already on
the table since several month. ;)
Sorry, but I don't understand what you want to propose.
In many cases we simply don't know where phandles are stored since we
don't have the type information in DT. But drivers already know the type
of a specific phandle and where to get it from, so the proposal is to
make that knowledge more generally useful so that it can be used for
dependency resolution.
How?
Is the issue around which we are dancing here the timing of
topsort and the probing? When the driver is probed, sure, it
touches and resolves a bunch of phandles and references other
nodes and devices. But that is at probe time, and it only has
the context of itself then.
I think we need to do the complete topsort *before* we attempt
1) Graph Construction
Add a new "emit dependencies" function to driver bindings.
Iterate over known devices or nodes in the DT in any order.
Call the "emit dependencies" function. It adds all
dependency edges to a global graph by knowing what
phandles or other pieces it will need.
A driver with no "emit dependencies" function can be
added to the graph anywhere without loss of generality.
Add any additional edges for whatever reason.
2) Topsort the generated driver graph
3) Call probe for real in topsort order
Yes, I think that makes a lot of sense. We need to provide a way to make
the dependency information available before probe time, otherwise we
don't gain anything. Whether we provide that in a form of a function
call or a table is an implementation detail.

I do think that requiring drivers to provide a function is going to make
things more complicated than necessary since that "emit dependencies"
function would need to copy a lot of the things that .probe() does
already. Sharing this information in a table sounds like a good idea. An
"emit dependencies" function in the core can use that data to resolve
dependencies whereas the driver core can equally use that information to
request the devices so that the drivers don't have to do so.

Thierry
Alexander Holler
2014-08-27 07:16:23 UTC
Permalink
Post by Jon Loeliger
I think we need to do the complete topsort *before* we attempt
1) Graph Construction
Add a new "emit dependencies" function to driver bindings.
Iterate over known devices or nodes in the DT in any order.
Call the "emit dependencies" function. It adds all
dependency edges to a global graph by knowing what
phandles or other pieces it will need.
A driver with no "emit dependencies" function can be
added to the graph anywhere without loss of generality.
Add any additional edges for whatever reason.
2) Topsort the generated driver graph
3) Call probe for real in topsort order
Alexander, I don't recall the details of your patch series.
Can you please remind us if it took this approach in the kernel?
Post by Alexander Holler
Anyway, I'm leaving this discussion. I've already made a proposal
which solved most mentioned problems (imho) and even offered usable
patches
Why should I? I've posted patches along with a lot of comments and
explanations, and e.g. you are just talking that it should be made like
my patches already did. And others do talk too like my patches and the
accompanying comments from me which explain most stuff never have
existed and just repeat what the patches already do without refering to
them.
Post by Jon Loeliger
Darn. I think you clearly have a pony in this race, and it
would be good if you still participated. Really.
Thanks. But I don't see it as a race and I don't want to take part in a
race (and I usually avoid those silly contests which have become chic in
the IT world). I just offered a solution (or at least a working starting
point to a solution).
Post by Jon Loeliger
Post by Alexander Holler
(ok, they suffer under the "not invented here" syndrom, but ...). ;)
There isn't a single thing in the entire Linux Kernel community
that was "invented here"; every aspect of it was NIH'ed by *someone*.
That's how it gets built, changed, maintained, fixed, etc.
Might be true in an ideal open source world and might have been true for
past kernel development when most people weren't full time kernel
developers. But nowadays it appears to me like many parts of the kernel
have become in the hands of closed groups. And they build and enforce
hurdles that high, that nobody else can take them without spending an
idiotic amount of time. Just like many other "consortiums" do, you only
have to build enough rules to protect from the outside while still
looking open.

E.g. an example I've seen often is that someone spend a lot of time to
examine and fix a bug and write a commented patch. And the only response
from the maintainer was that he should add an emtpy line before a return
statement and similiar silly things to enforce patch-ping-pong. Such
just drives people on the other end crazy and they likely won't spend
the time to post another patch (they still might fix other bugs, but
just for themself).

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-27 09:26:59 UTC
Permalink
Post by Alexander Holler
Why should I? I've posted patches along with a lot of comments and
explanations, and e.g. you are just talking that it should be made like
my patches already did. And others do talk too like my patches and the
accompanying comments from me which explain most stuff never have
existed and just repeat what the patches already do without refering to
them.
Just to repeat myself:

These patches which started this thread are not just some ideas without
any sense for the amount of work necessary to implement them (as seen so
often).

These patches are real working code everyone can apply to the mentioned
kernel version and see what happens with his board. They are even
checkpatched to avoid bean counting discussion.

(Don't forget to use patch 10/9 too)

Regards,

Alexander Holler
Alexander Holler
2014-08-26 07:56:43 UTC
Permalink
Post by Jon Loeliger
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Good idea.
The proposal on the table is to allow the probe code
to make a topological sort of the devices based on
dependency information either implied, explicitly stated
or both. That is likely a fundamentally correct approach.
1) What constitutes a dependency?
In my patches phandles are used. That works pretty good for almost all
DTs. So almost all dependencies are already declared in a DT and almost
no changes to the DT are necessary. The only binding I've seen where it
doesn't work is remote-endpoint, because that binding isn't a directed
dependency. So one of the two places where such a binding occurs needs a
"no-dependencies = <phandle>" to avoid circular dependencies which can
be solved.
Post by Jon Loeliger
2) How is that dependency expressed?
Already there in form of phandles.
Post by Jon Loeliger
3) How do we add missing dependencies?
My patches offer the possibility to extend or reduce the list of
(automatically generated) dependencies by using "[no-]dependencies = <
list of phandles >;"
Post by Jon Loeliger
4) Backward compatability problems.
None in my approach. The DT just includes an additional binding to
circumvent the missing but needed type information for phandles.

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Grant Likely
2014-08-26 08:51:28 UTC
Permalink
Post by Jon Loeliger
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Good idea.
The proposal on the table is to allow the probe code
to make a topological sort of the devices based on
dependency information either implied, explicitly stated
or both. That is likely a fundamentally correct approach.
1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.
There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.
Getting the dependency tree I think is only half the problem. The other
have is how to get the driver model to actually order probing using that
list. That problem is hard because the order drivers are probed is
currently determined by the interaction of driver link order, driver
initcall level, and device registration order. The first devices are
registered at an early initcall, before their drivers, and therefore
bind order is primarily determined by initcall level and driver link
order. However, later devices (ie. i2c clients) are registered by the
bus driver (ie. again, i2c) and probe order may be primarily link order
(if the driver is not yet registered) or registration order (if the
driver was registered before the parent bus).

g.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-26 09:56:52 UTC
Permalink
Post by Grant Likely
Post by Jon Loeliger
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Good idea.
The proposal on the table is to allow the probe code
to make a topological sort of the devices based on
dependency information either implied, explicitly stated
or both. That is likely a fundamentally correct approach.
1) What constitutes a dependency?
2) How is that dependency expressed?
3) How do we add missing dependencies?
4) Backward compatability problems.
There are other questions, of course. Is it a topsort
per bus? Are there required "early devices"? Should
the inter-node dependencies be expressed at each node,
or in a separate hierarchy within the DTS? Others.
Topsort by bus wouldn't work. That would imply that nothing uses
something involved with another bus. And early devices are handled fine
by normal dependencies too (as long as they are complete), so there is
no need to make an distinction.
Post by Grant Likely
Getting the dependency tree I think is only half the problem. The other
have is how to get the driver model to actually order probing using that
list. That problem is hard because the order drivers are probed is
currently determined by the interaction of driver link order, driver
initcall level, and device registration order. The first devices are
registered at an early initcall, before their drivers, and therefore
bind order is primarily determined by initcall level and driver link
order. However, later devices (ie. i2c clients) are registered by the
bus driver (ie. again, i2c) and probe order may be primarily link order
(if the driver is not yet registered) or registration order (if the
driver was registered before the parent bus).
That's why I've invented those "well-done"-initcalls. These are
initcalls which just register a driver and don't probe it. Thats
necessary to build a catalog of existing in-kernel-drivers (you have to
know what you are sorting). Fortunately most drivers are already of that
type. And those which aren't can be either just used like before (if it
works) or they can be changed.

Changing them can be done per board (only enable dependency based order
for a board if necessary drivers have changed).

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Holler
2014-08-26 10:18:35 UTC
Permalink
Post by Grant Likely
Getting the dependency tree I think is only half the problem. The other
have is how to get the driver model to actually order probing using that
list. That problem is hard because the order drivers are probed is
currently determined by the interaction of driver link order, driver
initcall level, and device registration order. The first devices are
registered at an early initcall, before their drivers, and therefore
bind order is primarily determined by initcall level and driver link
order. However, later devices (ie. i2c clients) are registered by the
bus driver (ie. again, i2c) and probe order may be primarily link order
(if the driver is not yet registered) or registration order (if the
driver was registered before the parent bus).
Using my patches, the problem which still exists is how to handle
devices (not drivers). I've build the patches based on the assumption
that device-handling happens automatically. Unfortunately that isn't
really true and device-handling looks awkward. Some drivers build them
themself, some require that a device already exists and some require
that a device doesn't already exist.

But I haven't looked in deep at that. I'm sure that can be fixed by
fixing drivers which do things differently than they should (maybe
because they needed to do such for dirty workarounds because no order
was guaranteed, which wouldn't be true anymore).

Anyway, I've not looked further into that problem (with devices, not
drivers) as it already seems quiet impossible to get the other necessary
stuff into the kernel in a reasonable time (before 32bit-HW which does
use DT will not be available anymore).

Regards,

Alexander Holler
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Mark Rutland
2014-08-26 09:54:59 UTC
Permalink
Post by Thierry Reding
Post by Mark Rutland
Post by Alexander Holler
Post by Thierry Reding
Anyway, those are all fairly standard reasons for where deferred probe
triggers, and since I do like deferred probe for it's simplicity and
reliability I'd rather not try to work around it if boot time is all
that people are concerned about.
It's neither simple nor reliable. It's non deterministic brutforcing
while making it almost impossible to identify real errors.
It's horrible, yes.
Post by Alexander Holler
In my humble opinion the worst way to solve something. I'm pretty sure
if I would have suggest such a solution, the maintainer crowd would have
eaten me without cooking.
We didn't have a better workable solution at the time.
You make it sound like we've come up with a better workable solution in
the meantime.
That wasn't the intention, but my sloppy wording does make it come
across that way.
Post by Thierry Reding
Post by Mark Rutland
Having a hack that got boards booting was considered better than not
having them boot.
I don't remember people being particularly enthralled by the idea.
Odd, I remember things quite differently.
Then perhaps my memory is faulty. :)
Post by Thierry Reding
Anyway, instead of going back and forth between "deferred probe is good"
and "deferred probe is bad", how about we do something useful now and
concentrate on how to make use of the information we have in DT with the
goal to reduce the number of cases where deferred probing is required?
Certainly.

Mark.
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...