[cmake-developers] Compiler features/extensions remaining/future issues
Stephen Kelly
2014-05-31 13:17:59 UTC
Hi,

Here is a dump of some notes I have accumulated regarding compile features.

1) Extensions requiring compile options

The target_compile_features interface is designed to allow use with compiler
extensions such as gnu_cxx_typeof and msvc_cxx_sealed. The extensions
discussed so far have been extensions which happen to depend on 'big switch'
options like /Za and -std=gnu++11 vs -std=c++11.

However, there are other cases.

Clang supports msvc_cxx_sealed on all platforms if the -fms-extensions
option is passed:

$ clang++ -fms-extensions main.cpp
main.cpp:353:10: warning: 'sealed' keyword is a Microsoft extension [-Wmicrosoft]
struct A sealed {};
^
1 warning generated.


It might make sense to allow passing additional options for compiler
extensions which need them. Eg

+set(_cmake_feature_test_msvc_cxx_sealed "${Clang34}")
+set(_cmake_feature_test_msvc_cxx_sealed_compile_option "-fms-extensions")

The patch at

https://www.mail-archive.com/cfe-***@cs.uiuc.edu/msg97160.html

requires -fplan9-extensions. I don't know if it enables any relevant
features, but I note it for completeness.


2) Incompatible features

Two features known to CMake might be incompatible.

For example, the cxx_auto_type feature (c++11) would conflict with a
cxx_auto_storage_type_specifier feature (c++98). In this case, it is a
non-issue because local variables have automatic storage duration by default
anyway, and the use of 'auto as a storage type specifier' was removed in
c++11 partly due to non-use, so no CMake user is likely to need such a thing.

Another example is exported templates, which are removed from c++11:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1426.pdf
(Why we can't afford ``export``)

which may conflict with the cxx_extern_templates feature (c++11). However,
only EDG implemented the exported templates feature, and it was only made
available for use with the Comeau compiler. I don't think it makes sense to
add it as a known feature to CMake.

So for now, I don't think incompatible features are an issue, but they may
become one in the future, and the CMake implementation would need some way
to handle that.

Another way that incompatible features could arise is if a compiler supports
a msvc_cxx_foo feature and a gnu_cxx_bar feature which may not be used
together because of compile options which may not be used together.


3) Extensions which may become standard

GNU 4.9 supports explicit template parameter syntax for generic lambdas:

https://gcc.gnu.org/gcc-4.9/changes.html
https://gcc.gnu.org/ml/gcc/2009-08/msg00174.html

int main()
{
  // a functional object that will add two objects
  auto add = [] (auto a, auto b) { return a + b; };

  // Allowed by GNU 4.9 with -std=gnu++1y (and -std=c++1y)
  // Adds only like-type objects
  auto add_constrained = [] <typename T> (T a, T b) { return a + b; };

  // Variadics not allowed:
  // auto num_args = [] <typename T...> (T... t) { return sizeof(t...); };

  int ret;
  ret = add(3, -3);
  ret = add_constrained(3, -3);
  ret = add(3.0, -3);
  // error: no match for call to ‘(main()::<lambda(T, T)>) (double, int)’
  // ret = add_constrained(3.0, -3);
  return ret;
}

Something like this might be added to the standard together with Concepts in
c++17, as per section 5.4 of

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3418.pdf

So, today we might add a gnu_cxx_lambda_template_parameters feature to CMake
supported by GNU 4.9, but we might add a cxx_lambda_template_parameters in
the future which might subsume or might conflict with the gnu_ variant (if
standard behavior is somewhat different).

I don't see any independent problem with this, but I just thought I'd point
it out. It might lead to 'conflicting features', or we might decide it is an
error to specify both gnu_cxx_lambda_template_parameters and
cxx_lambda_template_parameters in that case.

See also cxx_inline_namespaces and GNU strong namespaces:

https://gcc.gnu.org/onlinedocs/gcc/Namespace-Association.html#Namespace-Association

See also the Modules feature, which is going toward standardization and
currently requires a compile option in the Clang implementation

http://clang.llvm.org/docs/Modules.html

Also, I think GNU allows the c_restrict feature (C99) in c++ mode, which may
become standardized in c++ in the future.


4) WG 21 standing document 6 (study group 10)

Just pointing this out for completeness:

https://isocpp.org/files/papers/n4030.htm

Clang and CMake generally refer to 'cxx' instead of 'cpp', and I think it's
ok for the features known to CMake to continue to have cxx_ prefixes.

The SD6 document might be a useful reference for naming things.

Note though that there are some essential differences. Those recommendations
use a single macro for the c++11 constexpr feature and for the c++14 relaxed
constexpr feature

__cpp_constexpr = 200704 (cxx_constexpr)
__cpp_constexpr = 201304 (cxx_relaxed_constexpr)

As the CMake features are not differentiated in that way, some differences
compared to that document will remain necessary.
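To make the difference concrete, here is a minimal sketch of how client code would tell the two constexpr levels apart when only the single SD-6 macro is available. The names ConstexprLevel, constexpr_level, sum_to and check_sum_to are hypothetical; only __cpp_constexpr and its two values come from the recommendations above.

```cpp
#include <cassert>

// Hypothetical names for illustration. Only __cpp_constexpr and its two
// values come from the SD-6 recommendations quoted above.
enum ConstexprLevel { ConstexprNone, ConstexprCxx11, ConstexprRelaxed };

inline ConstexprLevel constexpr_level()
{
#if defined(__cpp_constexpr) && __cpp_constexpr >= 201304
  return ConstexprRelaxed; // corresponds to cxx_relaxed_constexpr
#elif defined(__cpp_constexpr) && __cpp_constexpr >= 200704
  return ConstexprCxx11;   // corresponds to cxx_constexpr only
#else
  return ConstexprNone;    // macro not defined: nothing can be concluded
#endif
}

#if defined(__cpp_constexpr) && __cpp_constexpr >= 201304
// Local variables and loops need the relaxed (c++14) rules.
constexpr int sum_to(int n)
{
  int sum = 0;
  for (int i = 1; i <= n; ++i) {
    sum += i;
  }
  return sum;
}
#else
// A single return statement is valid as c++11 constexpr, and as an
// ordinary function when constexpr is unavailable.
inline int sum_to(int n)
{
  return n <= 0 ? 0 : n + sum_to(n - 1);
}
#endif

// Both variants must agree on the result.
inline void check_sum_to()
{
  assert(sum_to(4) == 10);
}
```

With SD-6 the distinction is a value comparison on one macro, whereas CMake would expose it as two separately named features.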


5) Disabling features, aka 'Enabling' non-features

I could imagine adding a feature to control whether exceptions are allowed
in compiled code. With GNU there is a -fno-exceptions option which may be
passed to error on use of ``throw``. I believe MSVC has something
similar.

Would a cxx_no_exceptions feature be a reasonable fit into the compile
features concept, with the corresponding compile option?

Something similar could be said for rtti.
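As a rough illustration of what code built with such a feature might look like, the sketch below compiles both with and without exceptions. It assumes a compiler that implements the SD-6 macro __cpp_exceptions (GNU and Clang releases that do so leave it undefined under -fno-exceptions); checked_divide and check_checked_divide are hypothetical names.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical function for illustration. __cpp_exceptions is the SD-6
// macro; it is assumed to be undefined when exceptions are disabled.
inline int checked_divide(int numerator, int denominator, bool& ok)
{
  if (denominator == 0) {
#if defined(__cpp_exceptions)
    throw std::invalid_argument("division by zero");
#else
    ok = false; // ``throw`` is not available: report through the flag
    return 0;
#endif
  }
  ok = true;
  return numerator / denominator;
}

// The non-error path behaves the same in both build modes.
inline void check_checked_divide()
{
  bool ok = false;
  int result = checked_divide(6, 3, ok);
  assert(ok);
  assert(result == 2);
}
```

A similar pattern would apply to rtti with the corresponding SD-6 macro __cpp_rtti.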


6) target_compile_features as a universal feature interface

If compile features are to be linked in some way with compile options, the
idea of using them for cxx_position_independent_code arises. We already have
an interface in CMake for that, so I'm just listing this for completeness,
and for consideration of how future similar interfaces should be handled. We
might be able to think about that a bit now.

For example cxx_sse2 and cxx_avx features could be added which add
/arch:SSE2 or /arch:AVX for MSVC, and -msse2 or -mavx for GNU

http://stackoverflow.com/questions/661338/sse-sse2-and-sse3-for-gnu-c

This isn't something I think should definitely be done, but is something to
think about.


7) Extending the compiler feature support matrix.

I'm finished with extending the feature support matrix for now.

MSVC features are obviously missing, but someone else will have to add and
maintain those, and try them out to find any issues similar to those
recorded in comments for the GNU and Clang compilers such as discrepancy
between documented and actual features, broken features etc.

Extending the feature matrix to past releases also should be done carefully
(if at all). The c++11 standard evolved over almost a decade, things changed
in that time, and compilers implemented intermediate versions in that time.
For example, there are many versions of 'rvalue references' and MSVC does
not yet implement the accepted version

http://msdn.microsoft.com/en-us/library/hh567368.aspx#rvref

In this case, 'rvalue references v3' has a separate feature in CMake
(cxx_defaulted_move_initializers), so it is likely not an issue, but someone
extending support to that compiler or old releases of it, or old releases of
other compilers would need to check things like that.

http://rrsd.com/blincubator.com/bi_library/afio/
"and auto-generate implicit move constructors when all member data types
have move constructors (known in Microsoft as rvalue references v3.0)"
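A minimal sketch of the behavior described in that quote, assuming a c++11 standard library: Holder and move_and_read are hypothetical names. Because std::unique_ptr cannot be copied, the type is only move constructible if the compiler generates the move constructor implicitly.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical type for illustration. It declares no special member
// functions, so a conforming c++11 compiler generates the move
// constructor implicitly from the member's move constructor.
struct Holder
{
  std::unique_ptr<int> value;
};

// std::unique_ptr is not copyable, so Holder can only be move
// constructible if the implicit move constructor was generated.
static_assert(std::is_move_constructible<Holder>::value,
              "implicit move constructor is required");

inline int move_and_read()
{
  Holder a;
  a.value.reset(new int(42));
  Holder b(std::move(a));     // uses the implicit move constructor
  assert(a.value == nullptr); // ownership was transferred to b
  return *b.value;
}
```

A compiler release without this part of the rvalue references wording would be expected to reject the static_assert, which is the kind of check a feature test for that compiler would need.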

Anyone extending the matrix should also consider whether 'pure bugs' in
compiler releases are severe enough to disable the feature for that
compiler. This is mostly an issue for older Clang/GNU releases

http://milianw.de/blog/c11-platform-support#comment-1401

and for MSVC releases (I recorded a few serious bugs at

https://gitorious.org/cmake/steveires-cmake/source/0156b7f4:Modules/Compiler/MSVC-CXX-FeatureTests.cmake

)

We previously agreed to treat documented features as available, but as the
compiler feature matrix is currently small, this decision has not yet had to
be made concrete, and could be re-visited if someone had a need to do so.


8) Standard library features

It would be possible to somewhat-selectively record features of the standard
library by including a stdlib header and checking features of that by version.

http://thread.gmane.org/gmane.comp.compilers.clang.devel/22916/focus=22917

The GLIBCXX macro is not useful for version checking, but the compiler
macros could possibly be tested instead in that case because of tight
coupling

http://stackoverflow.com/a/11925468/2428389

Standard library features are in-scope for the SD6 feature testing

https://isocpp.org/files/papers/n4030.htm

and as far as I know, it is in scope for Boost.Config too.

I'm not convinced they should be in scope for CMake however. There would be
too many features drowning out other features (for each class/algorithm?,
c++14 additions of constexpr

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3470.html

etc). We could consider using the SD6 macros for std lib feature detection,
but as those macros are designed to be defined in the header that contains
the feature, we would end up having to compile a header which includes a
large amount of std lib headers at the beginning of CMake time to record the
features, which does not seem worth it.
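The header dependency can be seen in a small sketch. It assumes a standard library that follows the SD-6 recommendation of defining __cpp_lib_make_unique in <memory>; has_make_unique, make_owned and check_make_owned are hypothetical names.

```cpp
#include <cassert>
#include <memory> // the macro, if provided, only exists after this include

// Hypothetical wrapper; __cpp_lib_make_unique is the SD-6 macro name.
inline bool has_make_unique()
{
#if defined(__cpp_lib_make_unique)
  return true;
#else
  return false;
#endif
}

// Use std::make_unique when the library advertises it, and fall back
// to plain new otherwise.
template <typename T>
std::unique_ptr<T> make_owned(const T& value)
{
#if defined(__cpp_lib_make_unique)
  return std::make_unique<T>(value);
#else
  return std::unique_ptr<T>(new T(value));
#endif
}

inline void check_make_owned()
{
  assert(*make_owned(7) == 7);
}
```

Detecting every such macro up front would mean including each header that might define one, which is the cost described above.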

So I'm not planning to investigate that further.


Thanks,

Steve.
Ben Boeckel
2014-06-08 22:56:11 UTC
Post by Stephen Kelly
Here is a dump of some notes I have accumulated regarding compile features.
<snip>

I haven't read this thoroughly, just enough to see that this item is
missing:

9) Performance

I'm seeing considerable performance impact of this feature, even when it
isn't used:

[ Snipping out slower runs (of 3); full file available. All runs were
made with CMP0053 set to NEW, so configure times are mainly due to the
new parser improvements. ]

Running cmake build 3.0.0rc6 from cmake-3.0.0rc6...
Running tests for paraview...
Running make test 1...
XXXXXX Timing of configure: 29.782
XXXXXX Timing of generate: 19.9902
Running ninja test 2...
XXXXXX Timing of configure: 29.4229
XXXXXX Timing of generate: 37.4056
Running tests for vtk...
Running make test 1...
XXXXXX Timing of configure: 14.0793
XXXXXX Timing of generate: 8.14633
Running ninja test 1...
XXXXXX Timing of configure: 13.9817
XXXXXX Timing of generate: 12.9389
Running tests for slicer...
Running make test 2...
XXXXXX Timing of configure: 9.77397
XXXXXX Timing of generate: 31.0064
Running ninja test 2...
XXXXXX Timing of configure: 9.91163
XXXXXX Timing of generate: 40.8335
Running tests for sprokit...
Running make test 3...
XXXXXX Timing of configure: 2.9694
XXXXXX Timing of generate: 2.87538
Running ninja test 1...
XXXXXX Timing of configure: 3.01635
XXXXXX Timing of generate: 0.572799

versus:

Running cmake build master from cmake-master...
Running tests for paraview...
Running make test 3...
XXXXXX Timing of configure: 21.5985
XXXXXX Timing of generate: 35.5566
Running ninja test 2...
XXXXXX Timing of configure: 22.0632
XXXXXX Timing of generate: 59.1115
Running tests for vtk...
Running make test 3...
XXXXXX Timing of configure: 11.2056
XXXXXX Timing of generate: 10.4012
Running ninja test 3...
XXXXXX Timing of configure: 10.8211
XXXXXX Timing of generate: 14.5732
Running tests for slicer...
Running make test 3...
XXXXXX Timing of configure: 7.48876
XXXXXX Timing of generate: 52.8917
Running ninja test 1...
XXXXXX Timing of configure: 7.31795
XXXXXX Timing of generate: 62.4773
Running tests for sprokit...
Running make test 2...
XXXXXX Timing of configure: 3.28055
XXXXXX Timing of generate: 3.44212
Running ninja test 2...
XXXXXX Timing of configure: 3.32206
XXXXXX Timing of generate: 1.22392

with my performance branches on master:

Running cmake build merge from cmake-merge...
Running tests for paraview...
Running make test 3...
XXXXXX Timing of configure: 19.9036
XXXXXX Timing of generate: 30.26
Running ninja test 1...
XXXXXX Timing of configure: 19.1339
XXXXXX Timing of generate: 27.0993
Running tests for vtk...
Running make test 2...
XXXXXX Timing of configure: 8.63752
XXXXXX Timing of generate: 9.45069
Running ninja test 3...
XXXXXX Timing of configure: 8.81416
XXXXXX Timing of generate: 7.74161
Running tests for slicer...
Running make test 2...
XXXXXX Timing of configure: 6.75249
XXXXXX Timing of generate: 44.8809
Running ninja test 2...
XXXXXX Timing of configure: 6.58756
XXXXXX Timing of generate: 38.2476
Running tests for sprokit...
Running make test 2...
XXXXXX Timing of configure: 1.7703
XXXXXX Timing of generate: 2.57902
Running ninja test 3...
XXXXXX Timing of configure: 1.81398
XXXXXX Timing of generate: 0.441557

Which shows that my branches help ninja quite a bit (which makes sense
since that's been my focus so far), but now make is showing regressions
(which ninja probably shares) which were not helped nearly as much.

Looking at callgrind output[1], I'd say that compile features are a
non-trivial amount (10% of /total/ time; same as compile options) of the
added time especially considering that the projects aren't using compile
features at all.

Thanks,

--Ben

[1]The code run was next + my branches.
Stephen Kelly
2014-06-09 07:26:56 UTC
Post by Ben Boeckel
I'm seeing considerable performance impact of this feature, even when it
Can you create an sscce?

Are there many static libraries involved?

Thanks,

Steve.
David Cole
2014-06-09 11:46:42 UTC
Post by Stephen Kelly
Post by Ben Boeckel
I'm seeing considerable performance impact of this feature, even when it
Can you create an sscce?
Sounds like just downloading ParaView, ITK or Slicer, and configuring
it with CMake is the reproduce case. How much simpler and more
stand-alone do you want it to be?

Ben, can you provide a script that assumes nothing but a CMake install
to download a project and demonstrate the problem?
Ben Boeckel
2014-06-09 13:32:48 UTC
Post by David Cole
Ben, can you provide a script that assumes nothing but a CMake install
to download a project and demonstrate the problem?
Attached. Run as:

./test-cmake.sh <cmake root> [<cmake args>...]

the first argument defaults to /usr. All of my tests are done with
reconfigures, not initial configures, so cmake is run twice and the
second run's timing is used.

--Ben
Ben Boeckel
2014-06-09 15:16:49 UTC
Post by David Cole
Post by Stephen Kelly
Can you create an sscce?
Not really. The wall time impact is only really visible on sizeable
projects and the jitter in the time can be masked in smaller projects.
The smallest you're probably going to get is VTK without searching for
projects.

Other projects which might be of interest to test for performance impact
would be the larger KDE projects (at least kdelibs, kde-workspace, and
KDevelop are likely), and LLVM.
Post by David Cole
Sounds like just downloading ParaView, ITK or Slicer, and configuring
it with CMake is the reproduce case. How much simpler and more
stand-alone do you want it to be?
Specifically, ParaView should have Python bindings enabled. I haven't
tested ITK, and Slicer is the Slicer-build directory inside of the Slicer
superbuild (which you get by default from the source tree).

--Ben
Stephen Kelly
2014-06-10 15:30:24 UTC
Post by Ben Boeckel
Post by Stephen Kelly
Post by Ben Boeckel
I'm seeing considerable performance impact of this feature, even
when it
Post by Stephen Kelly
Can you create an sscce?
Sounds like just downloading ParaView, ITK or Slicer, and configuring
it with CMake is the reproduce case. How much simpler and more
stand-alone do you want it to be?
I think

http://sscce.org/

explains it quite well.

I want to avoid having to understand all of the ParaView CMake code and that
of its dependencies, and whether python bindings need to be enabled etc.

Ben, can you run your timing test with a commit before my topic and after
it? Timing tests with master and your extra topics don't tell us anything on
this question.

Thanks,

Steve.
Bill Hoffman
2014-06-10 16:57:11 UTC
Post by Stephen Kelly
I think
http://sscce.org/
explains it quite well.
I want to avoid having to understand all of the ParaView CMake code and that
of its dependencies, and whether python bindings need to be enabled etc.
The real problem is that we need to have some regression tests in CMake
that test for these types of performance issues. Right now we don't
have them. Creating a SSCCE for the issues we are seeing would be doing
just that.

However, right now we know there are issues with ParaView. It might
not be too much to ask for you to give it a quick try. I am sure Ben
could give you the -D options for CMake since ParaView is pretty self
contained. I think basically building ParaView with Python wrapping
shows the issue. So, you will need python installed and they
Post by Stephen Kelly
Ben, can you run your timing test with a commit before my topic and after
it? Timing tests with master and your extra topics don't tell us anything on
this question.
Do you think you could try ParaView and, if you run into trouble, give up
on it? However, I would hope it would involve just a git clone and a
cmake -DPARAVIEW_ENABLE_PYTHON=TRUE ../ParaViewSrc

In the meantime, I will see if we can work on trying to create some
better regression tests for CMake performance.

Thanks.


-Bill
Ben Boeckel
2014-06-10 17:17:57 UTC
Post by Stephen Kelly
I want to avoid having to understand all of the ParaView CMake code and that
of its dependencies, and whether python bindings need to be enabled etc.
Well, there isn't much you need to grok from the code there; it's just a
project with lots of targets with lots of interdependencies. When adding
compile features, performance drops noticeably.

Brad, Rob, and I looked at the code and performance output today and it
looks like it is the evaluation of the generator expressions made to
find the compile features, compile options, and compile definitions that
takes a long time. *Each* takes ~10% of overall time and the projects
I'm testing don't use the interface propagation features *at all*. It's
the evaluation of the generated genex that takes a long time (parsing is
now inconsequential). Using debugging messages shows that the
evaluations are being cached, but that probably only saves us from even
more slowdowns.

Brad is going to take a deeper look at it and might have more
information in the next few days.
Post by Stephen Kelly
Ben, can you run your timing test with a commit before my topic and after
it? Timing tests with master and your extra topics don't tell us anything on
this question.
Will do. Should be done in an hour or two. I'm using commits 593b69c (before)
and b56a9ae (after):

commit b56a9ae7f14189fd2bce2ca3e9441060ca231638
Merge: 593b69c 9eaf375
Author: Brad King <***@kitware.com>
Date: Tue Apr 15 10:32:11 2014 -0400

Merge topic 'target_compile_features'

9eaf3755 Export: Populate INTERFACE_COMPILE_FEATURES property.
8ed59fc2 Add target_compile_features command.
4e6ca504 cmTargetPropCommandBase: Change the interface to return bool.
5412dede cmTarget: Transitively evaluate compiler features.
baff4434 cmTarget: Allow populating COMPILE_FEATURES using generator expressions.
f97bf437 Features: Add cxx_auto_type.
03355d6b cmTarget: Add COMPILE_FEATURES target property.
faeddf64 project: Add infrastructure for recording CXX compiler features
913394af cmTarget: Add CXX_STANDARD and CXX_EXTENSION target properties.
8238a6cd Add some COMPILE_OPTIONS for specifying C++ dialect.
892243fc Tests: Require CMake 3.0 for the SystemInformation test.
59b5fdd3 Don't load Clang-CXX from AppleClang-CXX.

commit 593b69c9dc9e692b198f1ddbf9251130e61a4679
Merge: 33358fd 941a140
Author: Brad King <***@kitware.com>
Date: Tue Apr 15 10:22:41 2014 -0400

Merge topic 'aix-no-sstream'

941a1404 AIX: fix compilation error because of missing <sstream>

--Ben
Ben Boeckel
2014-06-10 18:14:16 UTC
Post by Ben Boeckel
Will do. Should be done in an hour or two. I'm using commits 593b69c (before)
Attached.

--Ben
Brad King
2014-06-11 14:46:52 UTC
Post by Ben Boeckel
Brad is going to take a deeper look at it and might have more
information in the next few days.
Here is a sscce::

  cmake_minimum_required(VERSION 2.8.9)
  project(ManyLibs C)
  set(LibPrev)
  foreach(n RANGE 100)
    add_library(Lib${n} SHARED lib.c)
    target_link_libraries(Lib${n} LINK_PUBLIC ${LibPrev})
    set(LibPrev Lib${n})
  endforeach()

On my machine:

========= ========= ========= ========
version   real      user      sys
========= ========= ========= ========
2.8.9     0m0.428s  0m0.384s  0m0.036s
2.8.12.2  0m2.100s  0m2.016s  0m0.060s
3.0.0     0m2.856s  0m2.796s  0m0.052s
487b6ccd  0m5.232s  0m5.176s  0m0.044s
see below 0m3.450s  0m3.356s  0m0.084s
========= ========= ========= ========

The usage requirement features have added a big performance
cost even when they are not used. The transitive INTERFACE
$<TARGET_PROPERTY> lookups cause O(n^2) computation time for
link chains of length n due to the dependency on headTarget.
(For reference of others, the "headTarget" is the target
whose build rules are currently being computed, and its
dependencies may report different usage requirements based
on what is consuming them.)
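The quadratic growth can be reproduced with a toy model. This is not CMake code, only a sketch under the assumption that every dependency in the chain is re-evaluated once per head target; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not CMake code: libraries 0..n-1 form a chain in which
// library i links publicly to library i-1. If the usage requirements
// of every dependency are re-evaluated for each head target, head
// target i triggers i+1 evaluations (itself plus everything below it).
inline std::size_t evaluations_per_head_target(std::size_t n)
{
  std::size_t count = 0;
  for (std::size_t head = 0; head < n; ++head) {
    for (std::size_t dep = 0; dep <= head; ++dep) {
      ++count; // one transitive property lookup for the pair (head, dep)
    }
  }
  return count; // equals n * (n + 1) / 2
}

// If the results did not depend on the head target, each library
// would only need to be evaluated once.
inline std::size_t evaluations_shared(std::size_t n)
{
  return n;
}

inline void check_chain_model()
{
  // foreach(n RANGE 100) creates 101 libraries, Lib0 through Lib100.
  assert(evaluations_per_head_target(101) == 5151);
  assert(evaluations_shared(101) == 101);
}
```

For the 101 libraries in the example above, the model gives 5151 evaluations instead of 101.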

I was able to lower the constant factor on the O(n^2) time
locally with a change to make GetTransitiveTargetClosure
callers:

http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/cmTarget.cxx;hb=487b6ccd#l5215
http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/cmTarget.cxx;hb=487b6ccd#l5431

receive the value returned by reference from an internal map
that memoizes the result. See attached patch series.
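The general shape of such a change, as a hedged sketch rather than the code from the patch series: ClosureCache and its members are hypothetical names, and Compute stands in for the expensive transitive walk.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in, not the code from the patch series: a cache
// keyed by target name that hands out references to its stored results.
class ClosureCache
{
public:
  typedef std::vector<std::string> Closure;

  ClosureCache() : Computations(0) {}

  // Returning a const reference avoids copying the vector on every call.
  // References into a std::map stay valid when other keys are inserted.
  const Closure& Get(const std::string& target)
  {
    assert(!target.empty());
    std::map<std::string, Closure>::iterator it = this->Cache.find(target);
    if (it == this->Cache.end()) {
      ++this->Computations;
      it = this->Cache.insert(std::make_pair(target, Compute(target))).first;
    }
    return it->second;
  }

  int GetComputations() const { return this->Computations; }

private:
  // Placeholder for the expensive transitive walk.
  static Closure Compute(const std::string& target)
  {
    return Closure(1, target + "_dependency");
  }

  std::map<std::string, Closure> Cache;
  int Computations;
};
```

This only removes repeated copies and recomputation per key; the number of distinct lookups, and therefore the O(n^2) behavior, is unchanged.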

However, please look at improving the implementation to have
something under O(n^2) complexity when the usage requirements
do not actually depend on the headTarget.

Thanks,
-Brad
Brad King
2014-06-11 14:52:16 UTC
Post by Brad King
I was able to lower the constant factor on the O(n^2) time
locally with a change to make GetTransitiveTargetClosure
http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/cmTarget.cxx;hb=487b6ccd#l5215
http://cmake.org/gitweb?p=cmake.git;a=blob;f=Source/cmTarget.cxx;hb=487b6ccd#l5431
receive the value returned by reference from an internal map
that memoizes the result. See attached patch series.
Actually attached this time.

-Brad
Brad King
2014-06-11 17:14:28 UTC
Post by Brad King
However, please look at improving the implementation to have
something under O(n^2) complexity when the usage requirements
do not actually depend on the headTarget.
After looking through the related code I have a few other comments.

I see a lot of duplication of evaluating the link impl. Loops
over LinkImplementationPropertyEntries appear for each usage
requirement type. I think that can be converted to use a method
like GetLinkImplementationLibraries if we were to add backtrace
information to the cmTarget::LinkImplementation list of libraries.

Can the dagChecker used for INTERFACE_SOURCES be moved into
GetDirectLinkLibraries or ComputeLinkImplementation? Then the
loop over LinkImplementationPropertyEntries for SOURCES could
be converted too.

Why do GetLinkImplementationLibraries and GetLinkImplementation
need a headTarget argument? The LINK_LIBRARIES are private to
the implementation of a target itself. Only the link interface
should be considered when headTarget != this.

I noticed some headTarget->GetMakefile()->FindTargetToUse()
calls. This is not a safe way to look up a target name
referenced by the currentTarget because imported target
names are scoped by directory. See the attached example
for some failure cases. Read B/CMakeLists.txt comments.
I think those are triggered by the call in processILibs.
I'm having trouble producing example breakage for the
calls in GetTransitivePropertyTargets, but they look
incorrect too. FindTargetToUse always needs to be called
on the cmMakefile context of the target whose link info
references the name to be found.

Thanks,
-Brad
Ben Boeckel
2014-06-11 21:17:39 UTC
Post by Brad King
However, please look at improving the implementation to have
something under O(n^2) complexity when the usage requirements
do not actually depend on the headTarget.
I've added a branch on stage which contains a test for cmake's big-O
order in the number of targets. I'm not merging it yet, but if anyone
wants to test with it, feel free.

linear-target-test

Please note that on a Core i7 @ 3.4 GHz the test currently takes 1100+
seconds. Feel free to remove the higher-end test cases to make the wait
easier. Unfortunately, the number of points is low, so the correlation
can be high while it looks obviously non-linear[1], so don't remove too
many.
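The caveat about correlation is easy to demonstrate. The sketch below computes the squared Pearson correlation for five points on a purely quadratic curve; it is not the code used in the linear-target-test branch, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Squared Pearson correlation coefficient of two equally sized samples.
// Both samples must have non-zero variance.
inline double r_squared(const std::vector<double>& x,
                        const std::vector<double>& y)
{
  assert(x.size() == y.size() && x.size() > 1);
  const double n = static_cast<double>(x.size());
  double mean_x = 0.0;
  double mean_y = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    mean_x += x[i];
    mean_y += y[i];
  }
  mean_x /= n;
  mean_y /= n;

  double sxx = 0.0;
  double syy = 0.0;
  double sxy = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double dx = x[i] - mean_x;
    const double dy = y[i] - mean_y;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }
  return (sxy * sxy) / (sxx * syy);
}

// Five points on y = x^2, which is clearly not linear.
inline double quadratic_sample_r_squared()
{
  std::vector<double> x;
  std::vector<double> y;
  for (int i = 1; i <= 5; ++i) {
    x.push_back(static_cast<double>(i));
    y.push_back(static_cast<double>(i) * i);
  }
  return r_squared(x, y);
}
```

For these five points the result is 3600/3740, roughly 0.96, even though the data is exactly quadratic, so a threshold on r² alone needs enough points to be meaningful.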

--Ben

[1]https://bit.ly/1jl6IYt r² == 0.82
Brad King
2014-07-09 15:12:14 UTC
Post by Brad King
cmake_minimum_required(VERSION 2.8.9)
project(ManyLibs C)
set(LibPrev)
foreach(n RANGE 100)
add_library(Lib${n} SHARED lib.c)
target_link_libraries(Lib${n} LINK_PUBLIC ${LibPrev})
set(LibPrev Lib${n})
endforeach()
A similar case also appears when building VTKWikiExamples
as discussed on the cmake user list here:

cmake 3.0 memory usage on VTKWikiExamples
http://thread.gmane.org/gmane.comp.programming.tools.cmake.user/49910/focus=49922

Memory usage explodes during generation and CMake sometimes
runs out and crashes.

-Brad
Stephen Kelly
2014-07-15 11:00:24 UTC
Post by Brad King
Post by Brad King
cmake_minimum_required(VERSION 2.8.9)
project(ManyLibs C)
set(LibPrev)
foreach(n RANGE 100)
add_library(Lib${n} SHARED lib.c)
target_link_libraries(Lib${n} LINK_PUBLIC ${LibPrev})
set(LibPrev Lib${n})
endforeach()
A similar case also appears when building VTKWikiExamples
cmake 3.0 memory usage on VTKWikiExamples
http://thread.gmane.org/gmane.comp.programming.tools.cmake.user/49910/focus=49922
Memory usage explodes during generation and CMake sometimes
runs out and crashes.
Is this affected in any way by the recent refactoring with the same
motivation?

Thanks,

Steve.
Brad King
2014-07-15 13:47:54 UTC
Post by Stephen Kelly
Post by Brad King
Memory usage explodes during generation and CMake sometimes
runs out and crashes.
Is this affected in any way by the recent refactoring with the same
motivation?
The refactoring so far does not help much directly, but it does
reduce some code duplication to make further improvements easier.

A small example that is a bit closer to what
happens in VTKWikiExamples is:

  cmake_minimum_required(VERSION 2.8.9)
  project(ManyLibs C)

  set(LibPrev)
  foreach(n RANGE 100)
    add_library(Lib${n} SHARED lib.c)
    target_link_libraries(Lib${n} LINK_PUBLIC ${LibPrev})
    set(LibPrev Lib${n})
  endforeach()
  foreach(n RANGE 100)
    add_executable(Exe${n} exe.c)
    target_link_libraries(Exe${n} ${LibPrev})
  endforeach()

On my machine:

========= ========= ========= ========
version   real      user      sys
========= ========= ========= ========
2.8.9     0m1.017s  0m0.912s  0m0.096s
2.8.12.2  0m7.293s  0m7.156s  0m0.120s
3.0.0     0m10.728s 0m10.384s 0m0.128s
487b6ccd  0m20.819s 0m20.724s 0m0.092s
7bc84502  0m11.390s 0m11.296s 0m0.084s
========= ========= ========= ========

The cleanup between 487b6ccd and 7bc84502 got this example
back to 3.0-ish time but nothing like 2.8.9.

-Brad

Brad King
2014-06-09 14:05:26 UTC
Steve,
Post by Ben Boeckel
9) Performance
Ben's concerns are more important than the following, but I also
wonder if we can reduce the startup time by combining the ABI
and Feature checks:

# Try to identify the ABI and configure it into CMakeCCompiler.cmake
include(${CMAKE_ROOT}/Modules/CMakeDetermineCompilerABI.cmake)
CMAKE_DETERMINE_COMPILER_ABI(C ${CMAKE_ROOT}/Modules/CMakeCCompilerABI.c)
# Try to identify the compiler features
include(${CMAKE_ROOT}/Modules/CMakeDetermineCompileFeatures.cmake)
CMAKE_DETERMINE_COMPILE_FEATURES(C)

into a single try-compile.

Thanks,
-Brad
Stephen Kelly
2014-06-10 16:09:04 UTC
Post by Stephen Kelly
Here is a dump of some notes I have accumulated regarding compile features.
Any comments on the rest of this?

Thanks,

Steve.
Brad King
2014-06-10 17:27:30 UTC
Post by Stephen Kelly
Post by Stephen Kelly
Here is a dump of some notes I have accumulated regarding compile features.
Any comments on the rest of this?
Someday perhaps ;)

My main concern beyond the performance side right now is getting the
features populated for VS >= 10. Then we will have the most popular
compilers on Linux, OS X, and Windows covered.

-Brad
Stephen Kelly
2014-06-13 09:19:21 UTC
Post by Stephen Kelly
Here is a dump of some notes I have accumulated regarding compile features.
Just a few more:

10) WriteCompilerDetectionHeader content size

Already, with only two compilers supported, the header generated by
WriteCompilerDetectionHeader is quite large when generating for all
features.

If that is a problem (Is it?), then a solution may be to generate something
like:

#if Foo_COMPILER_IS_GNU
#include "foo_compiler_detection_gnu.hpp"
#elif Foo_COMPILER_IS_Clang
#include "foo_compiler_detection_clang.hpp"
#endif

However, that would mean requiring the user to install multiple files rather
than just one. So, it might make sense to add a new signature

write_compiler_detection_header(
DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/compiler_detection"
FILE_SUFFIX "hpp"
PREFIX Foo
COMPILERS GNU Clang AppleClang MSVC Intel XL Cray HP SunPro
FEATURES
${cxx_known_features}
)

so users can do

install(DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/compiler_detection/"
  DESTINATION include
)

There are still use-cases for the single-file signature. For example, for a
project composed of multiple libraries (like Boost or KF5), each individual
library may care about only 2 or 3 compile features and each would generate
a header for only those that it needs. This doesn't actually arise for KF5
because compile feature detection is provided by Qt. For Boost, each library
depends on the Boost.Config library, so wouldn't use this either. But I
think keeping the use-case still makes sense.


11) WriteCompilerDetectionHeader vs GenerateExportHeader

Related to the 'universal interface for features' issue, it would be
possible to define features such as cmake_c{,xx}_hidden_visibility to
generate content similar to what GenerateExportHeader creates. That would
make GenerateExportHeader obsolete for compilers supported by
WriteCompilerDetectionHeader and the <LANG>_VISIBILITY_PRESET and
VISIBILITY_INLINES_HIDDEN target properties.

This would require a new signature like

write_compiler_detection_header(
FILE mytarget_compiler_detection.h
TARGET mytarget
COMPILERS GNU Clang
FEATURES
cmake_cxx_hidden_visibility
cxx_attribute_deprecated
)

and

write_compiler_detection_header(
DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/compiler_detection"
TARGET mytarget
COMPILERS GNU Clang
FEATURES
cmake_cxx_hidden_visibility
cxx_attribute_deprecated
)

The <target> is needed in order to know what the value of the DEFINE_SYMBOL
target property is.
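For reference, here is a sketch of the kind of content such a header might contain for a target named mytarget. The macro names MYTARGET_EXPORT and MYTARGET_DEPRECATED and the functions are hypothetical; mytarget_EXPORTS is the default value of the DEFINE_SYMBOL property for a target with that name.

```cpp
#include <cassert>

// CMake passes -Dmytarget_EXPORTS when compiling the sources of the
// shared library itself. It is defined by hand here only so that this
// sketch compiles stand-alone.
#ifndef mytarget_EXPORTS
#  define mytarget_EXPORTS
#endif

// Hypothetical macro names, for illustration only.
#if defined(_WIN32)
#  if defined(mytarget_EXPORTS)
#    define MYTARGET_EXPORT __declspec(dllexport)
#  else
#    define MYTARGET_EXPORT __declspec(dllimport)
#  endif
#elif defined(__GNUC__)
#  define MYTARGET_EXPORT __attribute__((visibility("default")))
#else
#  define MYTARGET_EXPORT
#endif

#if defined(__GNUC__)
#  define MYTARGET_DEPRECATED __attribute__((__deprecated__))
#elif defined(_MSC_VER)
#  define MYTARGET_DEPRECATED __declspec(deprecated)
#else
#  define MYTARGET_DEPRECATED
#endif

// An exported function, and a declaration marked as deprecated.
MYTARGET_EXPORT int mytarget_answer()
{
  return 42;
}
MYTARGET_EXPORT MYTARGET_DEPRECATED int mytarget_old_answer();

inline void check_mytarget_answer()
{
  assert(mytarget_answer() == 42);
}
```

The export branch is the only part that depends on the target, which is why the signature would need a TARGET argument.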


12) Platform-specific defines

We could consider adding defines such as PLATFORM_IS_UNIX,
PLATFORM_IS_WINDOWS etc, which CMake already knows how to detect. There are
other things which would be possible to detect too, such as architecture, OS
etc.

This is what Boost.Predef offers, but possibly not with the same names as
CMake (I didn't check).

https://github.com/boostorg/predef/tree/master/include/boost/predef


Thanks,

Steve.
Brad King
2014-06-13 13:39:15 UTC
Post by Stephen Kelly
However, that would mean requiring the user to install multiple files rather
than just one. So, it might make sense to add a new signature
write_compiler_detection_header(
DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/compiler_detection"
Installing directories can lead to leftover cruft in incremental builds.
The caller should pass in a variable to receive the list of files to
be installed.
Post by Stephen Kelly
11) WriteCompilerDetectionHeader vs GenerateExportHeader
IMO these two modules are solving orthogonal problems and should not
be mixed.
Post by Stephen Kelly
12) Platform-specific defines
Plenty of libraries already provide things like this. Not everyone
agrees what "UNIX" or even "Linux" means. I was hesitant to accept
WriteCompilerDetectionHeader in the first place because I never
wanted to get in the business of providing a C++ SDK. It is
reasonable to re-use all the C++ compiler version and feature
info we already have, but I don't think we should provide more.

-Brad
Stephen Kelly
2014-06-15 20:24:16 UTC
Post by Brad King
Post by Stephen Kelly
11) WriteCompilerDetectionHeader vs GenerateExportHeader
IMO these two modules are solving orthogonal problems and should not
be mixed.
I'm not sure I agree.

GenerateExportHeader needs to know about deprecation in order to generate
FOO_EXPORT_DEPRECATED and similar macros. So, they're not fully orthogonal.
Post by Brad King
Post by Stephen Kelly
12) Platform-specific defines
Plenty of libraries already provide things like this. Not everyone
agrees what "UNIX" or even "Linux" means. I was hesitant to accept
WriteCompilerDetectionHeader in the first place because I never
wanted to get in the business of providing a C++ SDK. It is
reasonable to re-use all the C++ compiler version and feature
info we already have, but I don't think we should provide more.
Sounds good. Anyone wanting more can be pointed to boost::predef.

Thanks,

Steve.
Daniel Pfeifer
2014-06-16 08:33:36 UTC
Post by Stephen Kelly
Post by Brad King
Post by Stephen Kelly
11) WriteCompilerDetectionHeader vs GenerateExportHeader
IMO these two modules are solving orthogonal problems and should not
be mixed.
I'm not sure I agree.
GenerateExportHeader needs to know about deprecation in order to generate
FOO_EXPORT_DEPRECATED and similar macros. So, they're not fully orthogonal.