David Nadlinger via Digitalmars-d
2014-06-27 01:31:14 UTC
Hi all,
right now, the use of std.math over core.stdc.math can cause a
huge performance problem in typical floating point graphics code.
An instance of this has recently been discussed here in the
"Perlin noise benchmark speed" thread [1], where even LDC, which
already beat DMD by a factor of two, generated code more than
twice as slow as that by Clang and GCC. Here, the use of floor()
causes trouble. [2]
Besides the somewhat slow pure D implementations in std.math, the
biggest problem is the fact that std.math almost exclusively uses
reals in its API. When working with single- or double-precision
floating point numbers, this is not only more data to shuffle
around than necessary, but on x86_64 requires the caller to
transfer the arguments from the SSE registers onto the x87 stack
and then convert the result back again. Needless to say, this is
a serious performance hazard. In fact, this accounts for a 1.9x
slowdown in the above benchmark with LDC.
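
To make the round trip concrete, here is a minimal sketch in D. The names floorDouble and floorViaReal are hypothetical helpers made up for illustration, not anything in std.math; the second one only mimics what a caller of a real-only API effectively does with a double argument.

```d
import core.stdc.math : cfloor = floor; // C's double floor(double)
import std.math : floor;

// Hypothetical helper: the value stays a double from argument to result,
// so on x86_64 nothing forces it out of the SSE registers.
double floorDouble(double x)
{
    return cfloor(x);
}

// Hypothetical helper mimicking a real-only API: the double is widened
// to real for the call and the result is narrowed back afterwards.
double floorViaReal(double x)
{
    real widened = x;
    return cast(double) floor(widened);
}

void main()
{
    // Both paths agree on the result; only the cost of getting there differs.
    assert(floorDouble(2.7) == 2.0);
    assert(floorViaReal(2.7) == 2.0);
    assert(floorDouble(-2.7) == -3.0);
    assert(floorViaReal(-2.7) == -3.0);
}
```

Whether the widening actually ends up on the x87 stack depends on the target and the compiler; the sketch only shows where the conversions sit in the source.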
Because of this, I propose to add float and double overloads (at
the very least the double ones) for all of the commonly used
functions in std.math. This is unlikely to break much code, but:
a) Somebody could rely on the fact that the calls effectively
widen the calculation to 80 bits on x86 when using type deduction.
b) Additional overloads make e.g. "&floor" ambiguous without
context, of course.
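
A rough sketch of what such overloads and the two caveats could look like. myFloor is a made-up name standing in for an overloaded std.math function, and forwarding to the C library is just an assumption to keep the example short, not a suggestion for the actual implementation.

```d
import core.stdc.math : floorf, cfloor = floor;

// Hypothetical overload set, one function per precision.
float  myFloor(float x)  { return floorf(x); }
double myFloor(double x) { return cfloor(x); }

void main()
{
    // Caveat a): with per-type overloads, the deduced result type follows
    // the argument type instead of always being real.
    auto d = myFloor(2.5);
    auto f = myFloor(2.5f);
    static assert(is(typeof(d) == double));
    static assert(is(typeof(f) == float));
    assert(d == 2.0 && f == 2.0f);

    // Caveat b): a bare "&myFloor" no longer names one specific function.
    // Giving the function pointer an explicit type supplies the context.
    double function(double) fp = &myFloor;
    assert(fp(-1.5) == -2.0);
}
```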
What do you think?
Cheers,
David
[1] http://forum.dlang.org/thread/lo19l7$n2a$1@digitalmars.com
[2] Fun fact: As the program happens to deal only with positive
numbers, the author could have just inserted a float-to-int
cast, sidestepping the issue altogether. All the other language
implementations have the floor() call too, though, so it doesn't
matter for this discussion.
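
A small sketch of the cast shortcut from [2], read here as a truncating conversion of the floating-point value to int (that reading is an assumption). The sample values are arbitrary and not taken from the benchmark.

```d
import core.stdc.math : floor;

void main()
{
    // For non-negative inputs, truncation toward zero and floor() agree.
    foreach (x; [0.0, 0.25, 1.0, 3.999, 12345.678])
    {
        assert(cast(int) x == cast(int) floor(x));
    }

    // For negative inputs they differ, so the shortcut only works when
    // the values are known to be non-negative.
    assert(cast(int)(-1.5) == -1);
    assert(cast(int) floor(-1.5) == -2);
}
```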