std.math performance (SSE vs. real)

Post by David Nadlinger via Digitalmars-d
Hi all,
right now, the use of std.math over core.stdc.math can cause a
huge performance problem in typical floating point graphics
code. An instance of this has recently been discussed here in
the "Perlin noise benchmark speed" thread [1], where even LDC,
which already beat DMD by a factor of two, generated code more
than twice as slow as that by Clang and GCC. Here, the use of
floor() causes trouble. [2]
Besides the somewhat slow pure D implementations in std.math,
the biggest problem is the fact that std.math almost
exclusively uses reals in its API. When working with single- or
double-precision floating point numbers, this is not only more
data to shuffle around than necessary, but on x86_64 requires
the caller to transfer the arguments from the SSE registers
onto the x87 stack and then convert the result back again.
Needless to say, this is a serious performance hazard. In fact,
this accounts for a 1.9x slowdown in the above benchmark with
LDC.
Because of this, I propose to add float and double overloads
(at the very least the double ones) for all of the commonly
used functions in std.math. This is unlikely to break much code, but:
a) Somebody could rely on the fact that the calls effectively
widen the calculation to 80 bits on x86 when using type
deduction.
b) Additional overloads make e.g. "&floor" ambiguous without
context, of course.
What do you think?
Cheers,
David
[1] http://forum.dlang.org/thread/lo19l7$n2a$1 at digitalmars.com
[2] Fun fact: As the program happens to deal only with positive
numbers, the author could have just inserted an int-to-float
cast, sidestepping the issue altogether. All the other language
implementations have the floor() call too, though, so it
doesn't matter for this discussion.
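
To make the proposal and its two caveats concrete, here is a minimal sketch. The names floorOf and floorImpl are hypothetical stand-ins, not Phobos functions, and the truncation-based body (in the spirit of footnote [2]) is only valid for values that fit in a long. It is there only to keep the example self-contained.

```d
// Hypothetical helper: floor via truncation, valid only while x fits in a long.
T floorImpl(T)(T x)
{
    const t = cast(T) cast(long) x;   // truncates toward zero
    return (t > x) ? t - 1 : t;       // step down for negative non-integers
}

// A per-type overload set, standing in for what std.math could offer.
float  floorOf(float x)  { return floorImpl(x); }
double floorOf(double x) { return floorImpl(x); }
real   floorOf(real x)   { return floorImpl(x); }

void main()
{
    float f = -1.5f;
    auto r = floorOf(f);

    // Caveat a): type deduction now yields float instead of real.
    static assert(is(typeof(r) == float));
    assert(r == -2.0f);

    // Caveat b): naming the pointer type selects one overload explicitly.
    double function(double) fp = &floorOf;
    assert(fp(2.75) == 2.0);
}
```

With only a real overload, the deduced type of r would be real, which is the widening behaviour that caveat a) says some code might depend on.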

I honestly always thought that it was a little odd that it forced
conversion to real. Personally I support this. It would also make
generic code that calls math functions simpler as it wouldn't
require casts back.

H. S. Teoh via Digitalmars-d

2014-06-27 02:14:38 UTC

Post by Tofu Ninja via Digitalmars-d

[...]


Post by Tofu Ninja via Digitalmars-d
I honestly always thought that it was a little odd that it forced
conversion to real. Personally I support this. It would also make
generic code that calls math functions more simple as it wouldn't
require casts back.

I support this too.

T

--
It is impossible to make anything foolproof because fools are so ingenious. -- Sammy

Jerry via Digitalmars-d

2014-06-27 03:28:22 UTC

Post by Tofu Ninja via Digitalmars-d

[...]

I support this too.

Me three. This seems like an unnecessary pessimisation and it would be
irritating for D to become associated with slow fp math.

Russel Winder via Digitalmars-d

2014-06-27 05:26:30 UTC

Post by Jerry via Digitalmars-d

Post by Tofu Ninja via Digitalmars-d

[...]

I support this too.

Me three. This seems like an unnecessary pessimisation and it would be
irritating for D to become associated with slow fp math.

So has anyone got a pull request ready?
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder at ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: russel at winder.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder

Iain Buclaw via Digitalmars-d

2014-06-27 06:14:34 UTC

On 27 June 2014 02:31, David Nadlinger via Digitalmars-d

Post by David Nadlinger via Digitalmars-d
[...]

This is the reason why floor is slow: it has an array copy operation.

---
auto vu = *cast(ushort[real.sizeof/2]*)(&x);
---

I didn't like it at the time I wrote it, but at least it prevented the
compiler (gdc) from removing all bit operations that followed.

If there is an alternative to the above, then I'd imagine that would
speed up floor tenfold.

Regards
Iain
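
One possible alternative to copying into a static array is to read the bits through a pointer cast. The sketch below is an assumption for illustration, not necessarily what Phobos or the pull request does. It uses double rather than real so that the layout (IEEE 754 binary64) is the same on every platform, and the helper names bitsOf and unbiasedExponent are hypothetical.

```d
// Hypothetical helper, not Phobos code: read the raw bits of a double
// through a pointer cast rather than copying into a static array.
ulong bitsOf(double x) @trusted pure nothrow @nogc
{
    return *cast(const(ulong)*) &x;
}

// Unbiased binary exponent from the IEEE 754 binary64 layout
// (11 exponent bits above a 52-bit fraction, bias 1023).
int unbiasedExponent(double x) @safe pure nothrow @nogc
{
    return cast(int) ((bitsOf(x) >> 52) & 0x7FF) - 1023;
}

void main()
{
    assert(unbiasedExponent(1.0) == 0);
    assert(unbiasedExponent(8.0) == 3);
    assert(unbiasedExponent(0.5) == -1);
}
```

Whether a compiler keeps the bit operations that follow such a cast is exactly the concern raised above, so any real change along these lines would need checking against each backend.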

Iain Buclaw via Digitalmars-d

2014-06-27 06:48:34 UTC

Post by Iain Buclaw via Digitalmars-d
[...]

Can you test with this?

https://github.com/D-Programming-Language/phobos/pull/2274

The float and double implementations of floor/ceil are trivial, and I can add them later.

Iain Buclaw via Digitalmars-d

2014-06-27 14:26:04 UTC

Post by Iain Buclaw via Digitalmars-d
[...]

Can you test with this?
https://github.com/D-Programming-Language/phobos/pull/2274
Float and Double implementations of floor/ceil are trivial and I can add later.

Added float/double implementations.

hane via Digitalmars-d

2014-06-27 09:37:53 UTC

On Friday, 27 June 2014 at 06:48:44 UTC, Iain Buclaw via

On 27 June 2014 07:14, Iain Buclaw <ibuclaw at gdcproject.org>

Post by Iain Buclaw via Digitalmars-d
[...]

Can you test with this?
https://github.com/D-Programming-Language/phobos/pull/2274
Float and Double implementations of floor/ceil are trivial and
I can add later.

Nice! I tested with the Perlin noise benchmark, and it got
faster (in my environment, 1.030s -> 0.848s).
But floor still consumes almost half of the execution time.

David Nadlinger via Digitalmars-d

2014-06-27 10:47:42 UTC

Post by hane via Digitalmars-d
On Friday, 27 June 2014 at 06:48:44 UTC, Iain Buclaw via

Post by Iain Buclaw via Digitalmars-d
Can you test with this?
https://github.com/D-Programming-Language/phobos/pull/2274
Float and Double implementations of floor/ceil are trivial and
I can add later.

Nice! I tested with the Perlin noise benchmark, and it got
faster(in my environment, 1.030s -> 0.848s).
But floor still consumes almost half of the execution time.

Wait, so DMD and GDC did actually emit a memcpy/… here? LDC
doesn't, and the change didn't have much of an impact on
performance.

What _does_ have a significant impact, however, is that the whole
of floor() for doubles can be optimized down to
roundsd <…>,<…>,0x1
when targeting SSE 4.1, or
vroundsd <…>,<…>,<…>,0x1
when targeting AVX.

This is why std.math will need to build on top of
compiler-recognizable primitives. Iain, Don, how do you think we
should handle this? One option would be to build std.math based
on an extended core.math with functions that are recognized as
intrinsics or suitably implemented in the compiler-specific
runtimes. The other option would be for me to submit LDC-specific
implementations to Phobos.

Cheers,
David

Iain Buclaw via Digitalmars-d

2014-06-27 12:20:54 UTC

On 27 June 2014 11:47, David Nadlinger via Digitalmars-d

On Friday, 27 June 2014 at 06:48:44 UTC, Iain Buclaw via Digitalmars-d

Nice! I tested with the Perlin noise benchmark, and it got faster(in my
environment, 1.030s -> 0.848s).
But floor still consumes almost half of the execution time.

Wait, so DMD and GDC did actually emit a memcpy/… here? LDC doesn't, and the
change didn't have much of an impact on performance.

Yes, IIRC _d_arraycopy to be exact (so we lose doubly so!)

What _does_ have a significant impact, however, is that the whole of floor()
for doubles can be optimized down to
roundsd <…>,<…>,0x1
when targeting SSE 4.1, or
vroundsd <…>,<…>,<…>,0x1
when targeting AVX.
This is why std.math will need to build on top of compiler-recognizable
primitives. Iain, Don, how do you think we should handle this?

My opinion is that we should never have pushed a variable-sized type
as the baseline for all floating point computations in the first
place.

But as we can't backtrack now, overloads will just have to do. I
would welcome a DIP to add new core.math intrinsics that could be
proven to be useful for the sake of maintainability (and portability).

Regards
Iain

David Nadlinger via Digitalmars-d

2014-06-27 10:50:24 UTC

Post by hane via Digitalmars-d
Nice! I tested with the Perlin noise benchmark, and it got
faster(in my environment, 1.030s -> 0.848s).
But floor still consumes almost half of the execution time.

Oh, and by the way, my optimized version (simply replace floor()
in perlin_noise.d with a call to llvm_floor() from
ldc.intrinsics) is 2.8x faster than the original one on my
machine (both with -mcpu=native).

David
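
The substitution described above can be sketched as a small wrapper. The name fastFloor is hypothetical. llvm_floor comes from ldc.intrinsics as mentioned in the post, and the fallback branch assumes a Phobos version whose std.math.floor accepts a double.

```d
version (LDC)
{
    import ldc.intrinsics : llvm_floor;

    // On LDC, forward to the LLVM intrinsic so the backend can pick
    // a single rounding instruction where the target supports it.
    double fastFloor(double x) { return llvm_floor(x); }
}
else
{
    import std.math : floor;

    // Elsewhere, fall back to the library implementation.
    double fastFloor(double x) { return floor(x); }
}

void main()
{
    assert(fastFloor(2.5) == 2.0);
    assert(fastFloor(-2.5) == -3.0);
}
```

Keeping the compiler-specific part behind one function is the same idea as building std.math on recognizable primitives, only done in user code.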

Iain Buclaw via Digitalmars-d

2014-06-28 11:30:45 UTC

On 27 June 2014 10:37, hane via Digitalmars-d

On Friday, 27 June 2014 at 06:48:44 UTC, Iain Buclaw via Digitalmars-d

Post by Iain Buclaw via Digitalmars-d
[...]

Can you test with this?
https://github.com/D-Programming-Language/phobos/pull/2274
Float and Double implementations of floor/ceil are trivial and I can add
later.

Nice! I tested with the Perlin noise benchmark, and it got faster(in my
environment, 1.030s -> 0.848s).
But floor still consumes almost half of the execution time.

I've done some further improvements in that PR. I'd imagine you'd see
a little more juice squeezed out.

Manu via Digitalmars-d

2014-06-27 10:50:52 UTC

On 27 June 2014 11:31, David Nadlinger via Digitalmars-d

Post by David Nadlinger via Digitalmars-d
[...]

Totally agree.
Maintaining commitment to deprecated hardware which could be removed
from the silicon at any time is a bit of a problem looking forwards.
Regardless of the decision about whether overloads are created, at the
very least, I'd suggest x64 should define real as double, since the
x87 is deprecated, and x64 ABI uses the SSE unit. It makes no sense at
all to use real under any general circumstances in x64 builds.

And aside from that, if you *think* you need real for precision, the
truth is, you probably have bigger problems.
Double already has massive precision. I find it's extremely rare to
have precision problems even with float under most normal usage
circumstances, assuming you are conscious of the relative magnitudes
of your terms.

John Colvin via Digitalmars-d

2014-06-27 11:10:56 UTC

On Friday, 27 June 2014 at 10:51:05 UTC, Manu via Digitalmars-d

Post by Manu via Digitalmars-d
[...]

I think real should stay how it is, as the largest
hardware-supported floating point type on a system. What needs to
change is dmd and phobos' default usage of real. Double should be
the standard. People should be able to reach for real if they
really need it, but normal D code should target the sweet spot
that is double*.

I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

*The number of algorithms that are both numerically
stable/correct and benefit significantly from > 64bit doubles is
very small. The same can't be said for 32bit floats.
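
For reference, the precision gap between the three types can be read directly from their built-in properties. This is a small sketch. The float and double values are fixed by IEEE 754, while the values printed for real depend on the target (64 significand bits where real maps to x87 extended precision, possibly something else on other platforms).

```d
import std.stdio : writefln;

void main()
{
    // Significand widths in bits, including the implicit leading bit.
    static assert(float.mant_dig == 24);
    static assert(double.mant_dig == 53);

    // real is platform-dependent, so it is printed rather than asserted.
    writefln("float:  %s bits, epsilon %s", float.mant_dig, float.epsilon);
    writefln("double: %s bits, epsilon %s", double.mant_dig, double.epsilon);
    writefln("real:   %s bits, epsilon %s", real.mant_dig, real.epsilon);
}
```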

Remo via Digitalmars-d

2014-06-27 11:29:36 UTC

Post by John Colvin via Digitalmars-d
[...]

Totally agree!
Please add float and double overloads and make double the default.
Sometimes float is just enough, but most of the time double should
be used.

If someone needs more precision than double can provide, then 80 bits
will probably not be enough anyway.

IMHO intrinsics should be used as default if possible.

Russel Winder via Digitalmars-d

2014-06-27 12:20:04 UTC

On Fri, 2014-06-27 at 11:10 +0000, John Colvin via Digitalmars-d wrote:
[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

[…]


dennis luehring via Digitalmars-d

2014-06-27 13:04:34 UTC

[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

what consumer hardware and compiler supports 128-bit floating points?

John Colvin via Digitalmars-d

2014-06-27 13:11:18 UTC

Post by Russel Winder via Digitalmars-d
On Fri, 2014-06-27 at 11:10 +0000, John Colvin via
[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for
decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

what consumer hardware and compiler supports 128-bit floating
points?

I think he was joking :)

No consumer hardware supports IEEE binary128 as far as I know.
Wikipedia suggests that Sparc used to have some support.

Russel Winder via Digitalmars-d

2014-06-28 11:03:17 UTC

Post by Russel Winder via Digitalmars-d
On Fri, 2014-06-27 at 11:10 +0000, John Colvin via
[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for
decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

what consumer hardware and compiler supports 128-bit floating points?

I think he was joking :)

Actually no, but…

Post by John Colvin via Digitalmars-d
No consumer hardware supports IEEE binary128 as far as I know.
Wikipedia suggests that Sparc used to have some support.

For once Wikipedia is not wrong. IBM 128-bit is not IEEE compliant (but
pre-dates IEEE standards). SPARC is IEEE compliant. No other hardware
manufacturer appears to care about accuracy of floating point expression
evaluation. GPU manufacturers have an excuse of sorts in that speed is
more important than accuracy for graphics model evaluation. GPGPU
suffers because of this.

Element 126 via Digitalmars-d

2014-06-27 13:24:22 UTC

[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

what consumer hardware and compiler supports 128-bit floating points?

I noticed that std.math mentions partial support for big endian non-IEEE
doubledouble. I first thought that it was a software implementation like
the QD library [1][2][3], but I could not find how to use it on x86_64.
It looks like it is only available for the PowerPC architecture.
Does anyone know about it?

[1] http://crd-legacy.lbl.gov/~dhbailey/mpdist/
[2] http://web.mit.edu/tabbott/Public/quaddouble-debian/qd-2.3.4-old/docs/qd.pdf
[3] www.davidhbailey.com/dhbpapers/quad-double.pdf

Iain Buclaw via Digitalmars-d

2014-06-27 13:50:17 UTC

Post by Element 126 via Digitalmars-d

On 27 June 2014 14:24, Element 126 via Digitalmars-d

[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

What consumer hardware and compiler support 128-bit floating point?

We only support native types in std.math. And partial support is
saying more than what there actually is. :-)

Kai Nacke via Digitalmars-d

2014-06-27 14:50:12 UTC

Post by Element 126 via Digitalmars-d

On Friday, 27 June 2014 at 13:50:29 UTC, Iain Buclaw via

Post by Iain Buclaw via Digitalmars-d
On 27 June 2014 14:24, Element 126 via Digitalmars-d

Post by Russel Winder via Digitalmars-d
On Fri, 2014-06-27 at 11:10 +0000, John Colvin via
[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

What consumer hardware and compiler support 128-bit floating point?

We only support native types in std.math. And partial support is
saying more than what there actually is. :-)

The doubledouble type is available for PowerPC. In fact, I try to
use this for my PowerPC64 port of LDC. The partial support here
is a bit annoying but I did not find the time to implement the
missing functions myself.

It is "native" in the sense that it is a supported type by gcc
and xlc.

Regards,
Kai

Kagamin via Digitalmars-d

2014-06-27 18:19:54 UTC

Post by Kai Nacke via Digitalmars-d
The doubledouble type is available for PowerPC. In fact, I try
to use this for my PowerPC64 port of LDC. The partial support
here is a bit annoying but I did not find the time to implement
the missing functions myself.
It is "native" in the sense that it is a supported type by gcc
and xlc.

Doesn't SSE2 effectively operate on double doubles too with
instructions like addpd (and others *pd)?

Element 126 via Digitalmars-d

2014-06-27 21:10:08 UTC

The doubledouble type is available for PowerPC. In fact, I try to use
this for my PowerPC64 port of LDC. The partial support here is a bit
annoying but I did not find the time to implement the missing
functions myself.
It is "native" in the sense that it is a supported type by gcc and xlc.

Doesn't SSE2 effectively operate on double doubles too with instructions
like addpd (and others *pd)?

I'm anything but an assembly guru (so please correct me if I'm wrong),
but if my understanding is right, SSE2 only operates element-wise (at
least for the operations you are mentioning).
For instance, if you operate on two "double2" vectors (in pseudo-code):
c[] = a[] # b[]
where # is a supported binary operation, then the value of the first
element of c only depends on the first elements of a and b.
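
To make the element-wise point concrete, here is a minimal Python sketch. It is not real SSE2: `packed_add` is a hypothetical helper that only mimics, lane by lane, what a packed-double add such as addpd does. Each lane is an ordinary double addition, so the second lane cannot make up for rounding in the first.

```python
def packed_add(a, b):
    """Mimic a packed-double add: every lane is added independently."""
    return [x + y for x, y in zip(a, b)]

# Lane 0 tries to add a tiny value to 1.0; lane 1 does unrelated work.
a = [1.0, 100.0]
b = [1e-17, 0.5]
c = packed_add(a, b)

# The tiny addend in lane 0 is rounded away, exactly as in a scalar double add.
lane0_lost_low_bits = (c[0] == 1.0)
```

So a "double2" vector is two separate 53-bit numbers side by side, not one number with twice the precision.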

The idea of double-double is that you operate on two doubles in such a
way that if you "concatenate" the mantissas of both, then you
effectively obtain the correct mathematical semantics of a quadruple
precision floating point number, with a higher number of significant
digits (~31 vs ~16 for double, in base 10).

I am not 100% sure yet, but I think that the idea is to simulate a
floating point number with a 106 bit mantissa and an 11 bit exponent as
x = s * ( m1 + m2 * 2^(-53) ) * 2^(e-b)
= s * m1 * 2^(e-b) + s * m2 * 2^(e-b-53)
where s is the sign bit (the same for both doubles), m1 and m2 the
mantissas (including the implied 1 for normalized numbers), e the base-2
exponent, b the common bias and 53 an extra bias for the low-order bits
(I'm ignoring the denormalized numbers and the special values). The
mantissa m1 of the first double gives the first 53 significant bits, and
that of the second (m2) the extra 53 bits.
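
As a rough illustration of the high/low split, the following Python sketch stores 1/3 as a pair of doubles and compares the error with that of a single double. The helper name `split_double_double` is made up for this example. Note that in this sketch each half is an ordinary double with its own sign and exponent, so the shared-sign formula above should be read as a simplified picture.

```python
from fractions import Fraction

def split_double_double(value):
    """Split an exact rational into (hi, lo) so that hi + lo is close to value."""
    hi = float(value)                  # nearest double to the value
    lo = float(value - Fraction(hi))   # nearest double to what hi missed
    return hi, lo

third = Fraction(1, 3)
hi, lo = split_double_double(third)

# Exact errors, computed with rationals so nothing is rounded here.
err_double = abs(Fraction(hi) - third)                        # one double
err_double_double = abs(Fraction(hi) + Fraction(lo) - third)  # hi + lo pair
```

The low part is tiny compared with the high part, and the pair as a whole lands far closer to 1/3 than the single double does.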

The addition is quite straightforward, but it gets tricky when
implementing the other operations. The articles I mentioned in my
previous post describe these operations for "quad-doubles",
achieving a ~62 digit precision (implemented in the QD library, but
there is also a CUDA implementation). It is completely overkill for most
applications, but it can be useful for studying the convergence of
numerical algorithms, and double-doubles can provide the extra precision
needed in some simulations (or to compare the results with double
precision).
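
For the addition case, a minimal sketch in Python might look like the code below. It uses the well-known error-free "two-sum" transformation; the function names and the simple renormalisation step are choices made for this example and are not taken from the QD sources. A production version needs more care, for instance when the high parts cancel.

```python
from fractions import Fraction

def two_sum(a, b):
    """Return (s, e) with s = a + b rounded, and s + e == a + b exactly."""
    s = a + b
    bb = s - a                      # the part of b that made it into s
    e = (a - (s - bb)) + (b - bb)   # what the rounding threw away
    return s, e

def dd_add(x, y):
    """Add two (hi, lo) pairs; a simple variant, not the most accurate one."""
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    hi = s + e                      # renormalise with a fast two-sum
    lo = e - (hi - s)               # (this step assumes |s| >= |e|)
    return hi, lo

one = (1.0, 0.0)
tiny = (1e-17, 0.0)
total = dd_add(one, tiny)           # the pair keeps what a lone double drops
```

With plain doubles, `1.0 + 1e-17` is just `1.0`; with the pair, the small term survives in the low part.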

It is also a comparatively faster alternative to arbitrary-precision
floating-point libraries like GMP/MPFR, since it does not need to
emulate every single digit, but instead takes advantage of the native
double precision instructions. The downside is that you cannot get more
significant bits than n*53, which is not suitable for computing the
decimals of pi for instance.

To give you more details, I will need to study these papers more
thoroughly. I am actually considering bringing double-double and
quad-double software support to D, either by making a binding to QD,
porting it or starting from scratch based on the papers. I don't know if
it will succeed but it will be an interesting exercise anyway. I don't
have a lot of time right now but I will try to start working on it in a
few weeks. I'd really like to be able to use it with D. Having to
rewrite an algorithm in C++ where I could only change one template
argument in the main() can be quite painful :-)

Russel Winder via Digitalmars-d

2014-06-28 11:00:28 UTC

On Fri, 2014-06-27 at 15:04 +0200, dennis luehring via Digitalmars-d

[…]

Post by John Colvin via Digitalmars-d
I understand why the current situation exists. In 2000 x87 was
the standard and the 80bit precision came for free.

Real programmers have been using 128-bit floating point for decades. All
this namby-pamby 80-bit stuff is just an aberration and should never
have happened.

What consumer hardware and compiler support 128-bit floating point?

None, but what has that to do with the core problem being debated?

The core problem here is that no programming language has a proper type
system able to deal with hardware. C has a hack, Fortran has a less
problematic hack. Go insists on float32, float64, etc. which is better
but still not great.
--
Russel.

Walter Bright via Digitalmars-d

2014-06-28 05:18:56 UTC

*The number of algorithms that are both numerically stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME when doing
professional numerical work.

Walter Bright via Digitalmars-d

2014-06-28 06:16:52 UTC

*The number of algorithms that are both numerically stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not just about
performance, it's also about accuracy.

John Colvin via Digitalmars-d

2014-06-28 09:07:15 UTC

Post by John Colvin via Digitalmars-d
*The number of algorithms that are both numerically
stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME
when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not
just about performance, it's also about accuracy.

I still maintain that the need for the precision of 80bit reals
is a niche demand. It's a very important niche, but it doesn't
justify having its relatively extreme requirements be the
default. Someone writing a matrix inversion has only themselves
to blame if they don't know plenty of numerical analysis and look
very carefully at the specifications of all operations they are
using.

Paying the cost of moving to/from the fpu, missing out on
increasingly large SIMD units, these make everyone pay the price.

Inclusion of the 'real' type in D was a great idea, but std.math
should be overloaded for float/double/real so people have the
choice where they stand on the performance/precision front.

francesco cattoglio via Digitalmars-d

2014-06-28 09:47:52 UTC

Post by John Colvin via Digitalmars-d
*The number of algorithms that are both numerically
stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME
when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not
just about performance, it's also about accuracy.

When you need accuracy, 999 times out of 1000 you change the
numerical technique, you don't just blindly upgrade the precision.
The only real reason one would use 80 bits is when there is an
actual need to add values which differ by more than 16 orders
of magnitude. And I've never seen this happen in any numerical
paper I've read.
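
One example of "changing the technique" instead of the precision is compensated summation. The sketch below is in Python and uses Kahan summation, which is this example's choice and not something named in the post. It compares a naive running sum with a compensated one, using `math.fsum` as the reference value.

```python
import math

def naive_sum(values):
    """Plain left-to-right accumulation in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total

def kahan_sum(values):
    """Compensated summation: carry the rounding error along in c."""
    total = 0.0
    c = 0.0                      # running estimate of the lost low-order bits
    for v in values:
        y = v - c                # feed the previous error back in
        t = total + y
        c = (t - total) - y      # what this addition just rounded away
        total = t
    return total

values = [0.1] * 1_000_000
reference = math.fsum(values)    # correctly rounded sum of the inputs
naive_error = abs(naive_sum(values) - reference)
kahan_error = abs(kahan_sum(values) - reference)
```

Both versions use only 64-bit doubles; the compensated one simply does not let the rounding errors pile up.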

Post by John Colvin via Digitalmars-d
I still maintain that the need for the precision of 80bit reals
is a niche demand. It's a very important niche, but it doesn't
justify having its relatively extreme requirements be the
default. Someone writing a matrix inversion has only themselves
to blame if they don't know plenty of numerical analysis and
look very carefully at the specifications of all operations
they are using.

Couldn't agree more. 80 bit IS a niche, which is really nice to
have, but shouldn't be the standard if we lose on performance.

Post by John Colvin via Digitalmars-d
Paying the cost of moving to/from the fpu, missing out on
increasingly large SIMD units, these make everyone pay the
price.

Especially the numerical analysts themselves will pay that price.
64 bit HAS to be as fast as possible, if you want to be
competitive when it comes to any kind of numerical work.

Walter Bright via Digitalmars-d

2014-06-28 10:42:21 UTC

When you need accuracy, 999 times out of 1000 you change the numerical
technique, you don't just blindly upgrade the precision.

I have experience doing numerical work? Upgrading the precision is the first
thing people try.

The only real reason one would use 80 bits is when there is an actual need to
add values which differ by more than 16 orders of magnitude. And I've never
seen this happen in any numerical paper I've read.

It happens with both numerical integration and inverting matrices. Inverting
matrices is commonplace for solving N equations with N unknowns.

Errors accumulate very rapidly and easily overwhelm the significance of the answer.

Especially the numerical analysts themselves will pay that price. 64 bit HAS to
be as fast as possible, if you want to be competitive when it comes to any kind
of numerical work.

Getting the wrong answer quickly is not useful when you're calculating the
stress levels in a part.

Again, I've done numerical programming in airframe design. The correct answer is
what matters. You can accept wrong answers in graphics display algorithms, but
not when designing critical parts.

Russel Winder via Digitalmars-d

2014-06-28 10:57:52 UTC

On Sat, 2014-06-28 at 03:42 -0700, Walter Bright via Digitalmars-d

When you need accuracy, 999 times out of 1000 you change the numerical
technique, you don't just blindly upgrade the precision.

I have experience doing numerical work? Upgrading the precision is the first
thing people try.

Nonetheless, the algorithm and the expression of the algorithm are often
more important. As shown by my Pi_Quadrature examples, you can appear to
have better results with greater precision, but the way the code operates
is actually the core problem: the code I have written does not do things
in the best way to achieve the best result at a given accuracy level.

[…]

Post by Walter Bright via Digitalmars-d
Errors accumulate very rapidly and easily overwhelm the significance of the answer.

I wonder if programmers should only be allowed to use floating point
numbers in their code if they have studied numerical analysis?

Especially the numerical analysts themselves will pay that price. 64 bit HAS to
be as fast as possible, if you want to be competitive when it comes to any kind
of numerical work.

Getting the wrong answer quickly is not useful when you're calculating the
stress levels in a part.

[…]

Post by Walter Bright via Digitalmars-d
Again, I've done numerical programming in airframe design. The correct answer is
what matters. You can accept wrong answers in graphics display algorithms, but
not when designing critical parts.

Or indeed when calculating anything to do with money.
--
Russel.

francesco cattoglio via Digitalmars-d

2014-06-28 11:27:35 UTC

Post by Walter Bright via Digitalmars-d
I have experience doing numerical work? Upgrading the precision
is the first thing people try.

Brute force is always the first thing people try :o)

Post by Walter Bright via Digitalmars-d
It happens with both numerical integration and inverting
matrices. Inverting matrices is commonplace for solving N
equations with N unknowns.
Errors accumulate very rapidly and easily overwhelm the
significance of the answer.

And that's exactly the reason you change approach instead of
getting greater precision: the "adding precision" approach scales
horribly, at least in my field of study, which is solving
numerical PDEs.
(BTW: no sane person inverts matrices)

Post by Walter Bright via Digitalmars-d
Getting the wrong answer quickly is not useful when you're
calculating the stress levels in a part.

We are talking about paying a price when you don't need it. With
the correct approach, solving numerical problems with double
precision floats yields perfectly fine results. And it is, in
fact, commonplace.

Again, I've not yet read a research paper in which it was clearly
stated that 64bit floats were not good enough for solving a whole
class of PDE problems. I'm not saying that real is useless, quite
the opposite: I love the idea of having an extra tool when the
need arises. I think the focus should be on not paying a price
for what you don't use.

Walter Bright via Digitalmars-d

2014-06-29 00:22:04 UTC

Post by francesco cattoglio via Digitalmars-d
We are talking about paying a price when you don't need it.

More than that - the suggestion has come up here (and comes up repeatedly) to
completely remove support for 80 bits. Heck, Microsoft has done so with VC++ and
even once attempted to completely remove it from 64 bit Windows (I talked them
out of it, you can thank me!).

Post by francesco cattoglio via Digitalmars-d
With the correct
approach, solving numerical problems with double precision floats yields
perfectly fine results. And it is, in fact, commonplace.

Presuming your average mechanical engineer is well versed in how to do matrix
inversion while accounting for precision problems is an absurd pipe dream.

Most engineers only know their math book algorithms, not comp sci best practices.

Heck, few CS graduates know how to do it.

Post by francesco cattoglio via Digitalmars-d
Again, I've not yet read a research paper in which it was clearly stated that
64bit floats were not good enough for solving a whole class of PDE problems. I'm
not saying that real is useless, quite the opposite: I love the idea of having
an extra tool when the need arises. I think the focus should be on not paying
a price for what you don't use.

I used to work doing numerical analysis on airplane parts. I didn't need a
research paper to discover how much precision matters and when my results fell
apart.

francesco cattoglio via Digitalmars-d

2014-06-29 07:57:56 UTC

Post by francesco cattoglio via Digitalmars-d
We are talking about paying a price when you don't need it.

More than that - the suggestion has come up here (and comes up
repeatedly) to completely remove support for 80 bits. Heck,
Microsoft has done so with VC++ and even once attempted to
completely remove it from 64 bit Windows (I talked them out of
it, you can thank me!).

Then I must have missed the post. Removing 80 bit support would
sound like madness to my ears.
And about that Microsoft thing, thanks a lot :o)

Walter Bright via Digitalmars-d

2014-06-29 09:46:11 UTC

Post by francesco cattoglio via Digitalmars-d
And about that Microsoft thing, thanks a lot :o)

Welcs!

Andrei Alexandrescu via Digitalmars-d

2014-06-28 14:01:13 UTC

Inverting matrices is commonplace for solving N equations with N
unknowns.

Actually nobody does that.

Also, one consideration is that the focus of numeric work changes with
time; nowadays it's all about machine learning, a field that virtually
didn't exist 20 years ago. In machine learning precision does make a
difference sometimes, but the key to good ML work is to run many
iterations over large data sets - i.e., speed.

I have an alarm go off when someone proffers a very strong conviction.
Very strong convictions mean there is no listening to any argument
right off the bat, which locks out any reasonable discussion before it
even begins.

For better or worse modern computing units have focused on 32- and
64-bit float, leaving 80-bit floats neglected. I think it's time to
accept that simple fact and act on it, instead of claiming we're the
best in the world at FP math while everybody else speeds by.

Andrei

John Colvin via Digitalmars-d

2014-06-28 14:15:35 UTC

On Saturday, 28 June 2014 at 14:01:13 UTC, Andrei Alexandrescu

Inverting matrices is commonplace for solving N equations with N
unknowns.

Actually nobody does that.
Also, one consideration is that the focus of numeric work
changes with time; nowadays it's all about machine learning

It's the most actively publicised frontier, perhaps, but there's
a huge amount of solid work happening elsewhere. People still
need better fluid, molecular dynamics etc. simulations, numerical
PDE solvers, finite element modelling and so on. There's a whole
world out there :)

That doesn't diminish your main point though.

Post by Andrei Alexandrescu via Digitalmars-d
For better or worse modern computing units have focused on 32-
and 64-bit float, leaving 80-bit floats neglected. I think it's
time to accept that simple fact and act on it, instead of
claiming we're the best in the world at FP math while everybody
else speeds by.
Andrei

Walter Bright via Digitalmars-d

2014-06-29 00:33:52 UTC

Inverting matrices is commonplace for solving N equations with N
unknowns.

Actually nobody does that.

I did that at Boeing when doing analysis of the movement of the control
linkages. The traditional way it had been done before was using paper and pencil
with drafting tools - I showed how it could be done with matrix math.

Post by Andrei Alexandrescu via Digitalmars-d
I have an alarm go off when someone proffers a very strong conviction. Very
strong convictions mean there is no listening to any argument right off the
bat, which locks out any reasonable discussion before it even begins.

So far, everyone here has dismissed my experience out of hand. You too, with
"nobody does that". I don't know how anyone here can make such a statement. How
many of us have worked in non-programming engineering shops, besides me?

Post by Andrei Alexandrescu via Digitalmars-d
For better or worse modern computing units have focused on 32- and 64-bit float,
leaving 80-bit floats neglected.

Yep, for the game/graphics industry. Modern computing has also produced crappy
trig functions with popular C compilers, because nobody using C cares about
accurate answers (or they just assume what they're getting is correct - even worse).

Post by Andrei Alexandrescu via Digitalmars-d
I think it's time to accept that simple fact
and act on it, instead of claiming we're the best in the world at FP math while
everybody else speeds by.

Leaving us with a market opportunity for precision FP.

I note that even the title of this thread says nothing about accuracy, nor did
the benchmark attempt to assess if there was a difference in results.

Andrei Alexandrescu via Digitalmars-d

2014-06-29 03:39:53 UTC

Inverting matrices is commonplace for solving N equations with N
unknowns.

Actually nobody does that.

I did that at Boeing when doing analysis of the movement of the control
linkages. The traditional way it had been done before was using paper
and pencil with drafting tools - I showed how it could be done with
matrix math.

Pen on paper is a low baseline. The classic way to solve linear
equations with computers is to use Gaussian elimination methods adjusted
to cancel imprecision. (There are a number of more specialized methods.)

For really large equations with sparse matrices one uses the method of
relaxations.

So far, everyone here has dismissed my experience out of hand. You too,
with "nobody does that". I don't know how anyone here can make such a
statement. How many of us have worked in non-programming engineering
shops, besides me?

My thesis - http://erdani.com/research/dissertation_color.pdf - and some
of my work at Facebook, which has been patented -
http://www.faqs.org/patents/app/20140046959 - use large matrix algebra
intensively.

Post by Andrei Alexandrescu via Digitalmars-d
For better or worse modern computing units have focused on 32- and 64-bit float,
leaving 80-bit floats neglected.

Yep, for the game/graphics industry. Modern computing has also produced
crappy trig functions with popular C compilers, because nobody using C
cares about accurate answers (or they just assume what they're getting
is correct - even worse).

Post by Andrei Alexandrescu via Digitalmars-d
I think it's time to accept that simple fact
and act on it, instead of claiming we're the best in the world at FP math while
everybody else speeds by.

Leaving us with a market opportunity for precision FP.
I note that even the title of this thread says nothing about accuracy,
nor did the benchmark attempt to assess if there was a difference in
results.

All I'm saying is that our convictions should be informed by, and
commensurate with, our expertise.

Andrei

Alex_Dovhal via Digitalmars-d

2014-06-28 15:31:36 UTC

If one wants better precision when solving linear equations, he/she
would at least use QR decomposition.

H. S. Teoh via Digitalmars-d

2014-06-28 18:16:56 UTC

Post by Walter Bright via Digitalmars-d
It happens with both numerical integration and inverting matrices.
Inverting matrices is commonplace for solving N equations with N
unknowns.
Errors accumulate very rapidly and easily overwhelm the significance of the answer.

If one wants better precision when solving linear equations, he/she would
at least use QR decomposition.

Yeah, inverting matrices is generally not the preferred method for
solving linear equations, precisely because of accumulated roundoff
errors. Usually one would use a linear algebra library which has
dedicated algorithms for solving linear systems, which extract the
solution(s) using more numerically-stable methods than brute-force
matrix inversion. They are also more efficient than inverting the matrix
and then doing a matrix multiplication to get the solution vector.
Mathematically, they are equivalent to matrix inversion, but numerically
they are more stable and not as prone to precision loss issues.
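
As a minimal sketch of what "solving without inverting" means, here is Gaussian elimination with partial pivoting in plain Python. This is a textbook method picked for illustration; real code would call a linear algebra library, and `solve_linear` is a name made up for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every row below the pivot row.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

a = [[2.0, 1.0], [1.0, 3.0]]
b = [3.0, 5.0]
x = solve_linear(a, b)   # solves 2x + y = 3 and x + 3y = 5
```

The inverse of the matrix is never formed; the solution vector comes straight out of the elimination and back substitution.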

Having said that, though, added precision is always welcome,
particularly when studying mathematical objects (as opposed to more
practical applications like engineering, where 6-8 digits of precision
in the result is generally more than good enough). Of course, the most
ideal implementation would be to use algebraic representations that can
represent quantities exactly, but exact representations are not always
practical (they are too slow for very large inputs, or existing
libraries only support hardware floating-point types, or existing code
requires a lot of effort to support software arbitrary-precision
floats). In such cases, squeezing as much precision out of your hardware
as possible is a good first step towards a solution.

T

--
Time flies like an arrow. Fruit flies like a banana.

Walter Bright via Digitalmars-d

2014-06-29 00:36:22 UTC

Post by H. S. Teoh via Digitalmars-d
(as opposed to more
practical applications like engineering, where 6-8 digits of precision
in the result is generally more than good enough).

Of the final result, sure, but NOT for the intermediate results. It is an utter
fallacy to conflate required precision of the result with precision of the
intermediate results.

Walter Bright via Digitalmars-d

2014-06-29 00:16:53 UTC

Post by Russel Winder via Digitalmars-d
I wonder if programmers should only be allowed to use floating point
numbers in their code if they have studied numerical analysis?

Be that as it may, why should a programming language make it harder for them to
get right than necessary?

The first rule in doing numerical calculations, hammered into me at Caltech, is
use the max precision available at every step. Rounding error is a major
problem, and is very underappreciated by engineers until they have a big screwup.

The idea that "64 fp bits ought to be enough for anybody" is a pernicious
disaster, to put it mildly.

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money than
using floating point. But yeah, counting money has its own special problems.
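
A small Python sketch of the difference, assuming amounts are whole cents. Python integers are arbitrary precision, not 64-bit longs, but the point about exactness is the same. `format_cents` is a hypothetical helper for display only.

```python
# Three ten-cent items, once as binary floating point, once as integer cents.
float_total = 0.1 + 0.1 + 0.1          # dollars as doubles
cents_total = 10 + 10 + 10             # cents as integers

float_matches = (float_total == 0.3)   # False: 0.1 has no exact binary form
cents_match = (cents_total == 30)      # True: integer arithmetic is exact

def format_cents(cents):
    """Render a non-negative integer number of cents as dollars."""
    return f"{cents // 100}.{cents % 100:02d}"
```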

H. S. Teoh via Digitalmars-d

2014-06-29 04:36:54 UTC

[...]

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).

T

--
The best compiler is between your ears. -- Michael Abrash

deadalnix via Digitalmars-d

2014-06-29 04:46:47 UTC

On Sunday, 29 June 2014 at 04:38:31 UTC, H. S. Teoh via

On Sat, Jun 28, 2014 at 05:16:53PM -0700, Walter Bright via
[...]

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to
represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).
T

MtGox was using float.

Paolo Invernizzi via Digitalmars-d

2014-06-29 08:44:22 UTC

Post by deadalnix via Digitalmars-d
On Sunday, 29 June 2014 at 04:38:31 UTC, H. S. Teoh via

On Sat, Jun 28, 2014 at 05:16:53PM -0700, Walter Bright via
[...]

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to
represent money
than using floating point. But yeah, counting money has its
own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).
T

MtGox was using float.

LOL ;-)

---
Paolo

Sean Kelly via Digitalmars-d

2014-06-29 05:21:52 UTC

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to
represent money than using floating point. But yeah, counting
money has its own special problems.

Maybe if by "money" you mean dollars in the bank. But for
anything much beyond that you're doing floating point math.
Often with specific rules for how and when rounding should occur.
Perhaps interestingly, it's typical for hedge funds to have a
"rounding partner" who receives all the fractional pennies that
are lost when divvying up the income for the other investors.

Walter Bright via Digitalmars-d

2014-06-29 06:34:59 UTC

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).

I think that's what I said :-)

Andrei Alexandrescu via Digitalmars-d

2014-06-29 14:59:42 UTC

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).

A friend who works at a hedge fund (after making the rounds to the NYC
large financial companies) told me that's a myth. Any nontrivial
calculation involving money (interest, fixed income, derivatives, ...)
needs floating point. He never needed more than double.

Andrei

Iain Buclaw via Digitalmars-d

2014-06-29 15:50:52 UTC

On 29 June 2014 15:59, Andrei Alexandrescu via Digitalmars-d

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).

A friend who works at a hedge fund (after making the rounds to the NYC large
financial companies) told me that's a myth. Any nontrivial calculation
involving money (interest, fixed income, derivatives, ...) needs floating
point. He never needed more than double.
Andrei

I would have thought money would use fixed point decimal floats.

Iain

David Nadlinger via Digitalmars-d

2014-06-29 16:05:15 UTC

On Sunday, 29 June 2014 at 15:51:03 UTC, Iain Buclaw via

I would have thought money would use fixed point […] floats.

Huh? ;)

David

Andrei Alexandrescu via Digitalmars-d

2014-06-29 16:54:44 UTC

Post by Iain Buclaw via Digitalmars-d
On 29 June 2014 15:59, Andrei Alexandrescu via Digitalmars-d

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).

A friend who works at a hedge fund (after making the rounds to the NYC large
financial companies) told me that's a myth. Any nontrivial calculation
involving money (interest, fixed income, derivatives, ...) needs floating
point. He never needed more than double.
Andrei

I would have thought money would use fixed point decimal floats.

And what meaningful computation can you do with such? Using fixed point
for money would be like the guy in Walter's story rounding to two
decimals after each step in the calculation.

Even for a matter as simple as average price for a share bought in
multiple batches you need floating point.

Andrei

Walter Bright via Digitalmars-d

2014-06-29 19:17:23 UTC

Post by Iain Buclaw via Digitalmars-d
On 29 June 2014 15:59, Andrei Alexandrescu via Digitalmars-d

Post by Russel Winder via Digitalmars-d
Or indeed when calculating anything to do with money.

You're better off using 64 bit longs counting cents to represent money
than using floating point. But yeah, counting money has its own
special problems.

For counting money, I heard that the recommendation is to use
fixed-point arithmetic (i.e. integer values in cents).

A friend who works at a hedge fund (after making the rounds to the NYC large
financial companies) told me that's a myth. Any nontrivial calculation
involving money (interest, fixed income, derivatives, ...) needs floating
point. He never needed more than double.
Andrei

I would have thought money would use fixed point decimal floats.

And what meaningful computation can you do with such? Using fixed point for
money would be like the guy in Walter's story rounding to two decimals after
each step in the calculation.
Even for a matter as simple as average price for a share bought in multiple
batches you need floating point.

I can see using floating point for the calculation, but the final result should
be stored as whole pennies.
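
A minimal sketch of that compromise in D, assuming hypothetical names (Batch, averagePriceCents) that do not come from the thread: the average share price is worked out in double and rounded exactly once, when the result is stored as whole cents.

```d
import std.math : round;

/// One purchase: how many shares, and the price per share in cents.
/// (Hypothetical type, for illustration only.)
struct Batch
{
    long shares;
    long priceCents;
}

/// Average price per share, returned as whole cents.
/// Assumes the batches contain at least one share in total.
long averagePriceCents(const Batch[] batches)
{
    double totalCost = 0.0; // intermediate arithmetic stays in double
    long totalShares = 0;
    foreach (b; batches)
    {
        totalShares += b.shares;
        totalCost += cast(double) b.shares * b.priceCents;
    }
    // Round once, at the very end, instead of after every step.
    return cast(long) round(totalCost / totalShares);
}

void main()
{
    // 100 shares at $10.01 and 50 shares at $10.05: the exact average is
    // 1002.33... cents, which is stored as 1002.
    assert(averagePriceCents([Batch(100, 1001), Batch(50, 1005)]) == 1002);
}
```

Rounding only at the storage boundary avoids the "round to two decimals after each step" problem mentioned earlier in the thread.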

Russel Winder via Digitalmars-d

2014-06-29 18:13:43 UTC

On Sun, 2014-06-29 at 07:59 -0700, Andrei Alexandrescu via Digitalmars-d
wrote:
[…]

Post by Andrei Alexandrescu via Digitalmars-d
A friend who works at a hedge fund (after making the rounds to the NYC
large financial companies) told me that's a myth. Any nontrivial
calculation involving money (interest, fixed income, derivatives, ...)
needs floating point. He never needed more than double.

Very definitely so. Fixed point or integer arithmetic for simple
"household" finance fair enough, but for "finance house" calculations
you generally need 22+ significant denary digits to meet with compliance
requirements.

Walter Bright via Digitalmars-d

2014-06-29 19:18:48 UTC

Post by Russel Winder via Digitalmars-d
On Sun, 2014-06-29 at 07:59 -0700, Andrei Alexandrescu via Digitalmars-d
[…]

Doubles are only good to 17 digits, and even that 17th digit is flaky.

David Nadlinger via Digitalmars-d

2014-06-29 19:28:41 UTC

Post by Russel Winder via Digitalmars-d
Very definitely so. Fixed point or integer arithmetic for
simple
"household" finance fair enough, but for "finance house"
calculations
you generally need 22+ significant denary digits to meet with
compliance
requirements.

Doubles are only good to 17 digits, and even that 17th digit is flaky.

The 11 extra bits in an x87 real wouldn't get you to 22 either,
though. ;)
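
The digit counts being traded here can be checked from D's built-in floating point properties. A small sketch: the values for double follow from IEEE 754 binary64, while the values printed for real depend on the target (on x87 hardware one would expect a 64-bit significand and 18 guaranteed decimal digits).

```d
import std.stdio : writefln;

// double is IEEE 754 binary64: 53 significand bits (about 15.95 decimal
// digits), of which 15 decimal digits are guaranteed to survive a round trip.
static assert(double.mant_dig == 53);
static assert(double.dig == 15);

void main()
{
    writefln("double: %s significand bits, %s decimal digits",
             double.mant_dig, double.dig);
    // real is target-dependent, so it is printed rather than asserted.
    writefln("real:   %s significand bits, %s decimal digits",
             real.mant_dig, real.dig);
}
```

A 64-bit significand corresponds to roughly 19 decimal digits, which is consistent with the remark that x87 reals do not reach 22 either.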

David

Russel Winder via Digitalmars-d

2014-06-29 21:33:45 UTC

On Sun, 2014-06-29 at 12:18 -0700, Walter Bright via Digitalmars-d
wrote:
[…]

Post by Walter Bright via Digitalmars-d
Doubles are only good to 17 digits, and even that 17th digit is flaky.

Hence the use of software "real" numbers becoming the norm for
calculating these bioinformatics and quant models.

(I rarely see better than 14 or 15 denary digits of accuracy for 64-bit
fp hardware. :-(
--
Russel.
=============================================================================
Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder at ekiga.net
41 Buckmaster Road m: +44 7770 465 077 xmpp: russel at winder.org.uk
London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder

Andrei Alexandrescu via Digitalmars-d

2014-06-29 22:37:16 UTC

Post by Russel Winder via Digitalmars-d
On Sun, 2014-06-29 at 07:59 -0700, Andrei Alexandrescu via Digitalmars-d
[…]

I don't know of US regulations that ask for such.

What I do know is I gave my hedge fund friend a call (today is his name
day so it was as good a pretext as any) and mentioned that some people
believe fixed point is used in finance. His answer was:

BWAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAHAAAAAAAAAAAAAAAAAAAAAAA!

I asked about how they solve accumulating numeric errors and he said
it's on a case-by-case basis. Most of the time it's pennies for billions of
dollars, so nobody cares. Sometimes there are reconciliations needed -
so-called RECs - that compare and adjust outputs of different algorithms.

One nice war story he recalled: someone was storing the number of
seconds as a double, and truncating it to int where needed. An error of at
most one second wasn't important in the context. However, sometimes the
second was around midnight so an error of one second was an error of one
day, which was significant. The solution was to use rounding instead of
truncation.
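
The midnight problem is easy to reproduce. The sketch below is an illustration under assumptions, not the code from the story: the helper names and the sample timestamp are made up, and the timestamp is simply a value that should have been exactly one day but came out slightly low.

```d
import std.math : round;

enum secondsPerDay = 86_400;

/// Day index when the seconds are truncated to int first.
int dayByTruncation(double seconds)
{
    return cast(int) seconds / secondsPerDay;
}

/// Day index when the seconds are rounded to the nearest int first.
int dayByRounding(double seconds)
{
    return cast(int) round(seconds) / secondsPerDay;
}

void main()
{
    // Intended to be midnight at the start of day 1, but slightly low
    // because of accumulated floating-point error (illustrative value).
    double almostMidnight = 86399.999999;

    assert(dayByTruncation(almostMidnight) == 0); // one day off
    assert(dayByRounding(almostMidnight) == 1);   // the intended day
}
```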

Andrei

Russel Winder via Digitalmars-d

2014-06-28 10:33:44 UTC

On Sat, 2014-06-28 at 09:07 +0000, John Colvin via Digitalmars-d wrote:
[…]

Post by John Colvin via Digitalmars-d
I still maintain that the need for the precision of 80bit reals
is a niche demand. It's a very important niche, but it doesn't
justify having its relatively extreme requirements be the
default. Someone writing a matrix inversion has only themselves
to blame if they don't know plenty of numerical analysis and look
very carefully at the specifications of all operations they are
using.

I fear the whole argument is getting misguided. We should reset.

If you are doing numerical calculations then accuracy is critical.
Arbitrary precision floats are the only real (!) way of doing any
numeric non-integer calculation, and arbitrary precision integers are
the only way of doing integer calculations.

However speed is also an issue, so to obtain speed we have hardware
integer and floating point ALUs.

The cost for the integer ALU is bounded integers. Python appreciates
this and uses hardware integers when it can and software integers
otherwise. Thus Python is very good for doing integer work. C, C++, Go,
D, Fortran, etc. are fundamentally crap for integer calculation because
integers are bounded. Of course if calculations are provably within the
hardware integer bounds this is not a constraint and we are happy with
hardware integers. Just don't try calculating factorial, Fibonacci
numbers and other numbers used in some bioinformatics and quant models.
There is a reason why SciPy has a massive following in bioinformatics
and quant computing.
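
For what it is worth, the bounded-integer cost is easy to demonstrate in D, and Phobos does ship a software integer type, std.bigint.BigInt. The sketch below uses a hypothetical generic helper named factorial; it is an illustration, not a claim about how anyone in this thread computes these values.

```d
import std.bigint : BigInt;

/// n! computed in whatever integer-like type T the caller picks.
/// (Hypothetical helper, for illustration only.)
T factorial(T)(uint n)
{
    T result = 1;
    foreach (i; 2 .. n + 1)
        result *= i;
    return result;
}

void main()
{
    // 20! still fits into a 64-bit hardware integer...
    assert(factorial!ulong(20) == 2_432_902_008_176_640_000UL);
    // ...but 21! does not: the ulong result silently wraps around.
    assert(BigInt(factorial!ulong(21)) != factorial!BigInt(21));
    // The arbitrary-precision type keeps every digit.
    assert(factorial!BigInt(25) == BigInt("15511210043330985984000000"));
}
```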

The cost for floating point ALU is accuracy. Hardware floating point
numbers are dreadful in that sense, but again the issue is speed and for
GPU they went 32-bit for speed. Now they are going 64-bit as they can
just about get the same speed and the accuracy is so much greater. For
hardware floating point the more bits you have the better. Hence IBM in
the 360 and later having 128-bit floating point for accuracy at the
expense of some speed. Sun had 128-bit in the SPARC processors for
accuracy at the expense of a little speed.

As Walter has or will tell us, C (and thus C++) got things woefully
wrong in support of numerical work because the inventors were focused on
writing operating systems, supporting only PDP hardware. They and the
folks that then wrote various algorithms didn't really get numerical
analysis. If C had targeted IBM 360 from the outset things might have
been better.

We have to be clear on this: Fortran is the only language that supports
hardware floating types even at all well.

Intel's 80-bit floating point were an aberration, they should just have
done 128-bit in the first place. OK so they got the 80-bit stuff as a
sort of free side-effect of creating 64-bit, but they ran with it. They
shouldn't have done. I cannot see it ever happening again. cf. ARM.

By being focused on Intel chips, D has failed to get floating point
correct in a very analogous way to C failing to get floating point types
right by focusing on PDP. Yes using 80-bit on Intel is good, but no-one
else has this. Floating point sizes should be 32-, 64-, 128-, 256-bit,
etc. D needs to be able to handle this. So does C, C++, Java, etc. Go
will be able to handle it when it is ported to appropriate hardware as
they use float32, float64, etc. as their types. None of this float,
double, long double, double double rubbish.

So D should perhaps make a breaking change and have types int32, int64,
float32, float64, float80, and get away from the vagaries of bizarre
type relationships with hardware?
--
Russel.

John Colvin via Digitalmars-d

2014-06-28 11:50:17 UTC

On Saturday, 28 June 2014 at 10:34:00 UTC, Russel Winder via

Post by Russel Winder via Digitalmars-d
So D should perhaps make a breaking change and have types
int32, int64,
float32, float64, float80, and get away from the vagaries of
bizarre
type relationships with hardware?

`real`* is the only builtin numerical type in D that doesn't have
a defined width. http://dlang.org/type.html

*well I guess there's size_t and ptrdiff_t, but they aren't
distinct types in their own right.

Element 126 via Digitalmars-d

2014-06-28 12:43:02 UTC

Post by Russel Winder via Digitalmars-d
[…]

Post by John Colvin via Digitalmars-d
I still maintain that the need for the precision of 80bit reals
is a niche demand. It's a very important niche, but it doesn't
justify having its relatively extreme requirements be the
default. Someone writing a matrix inversion has only themselves
to blame if they don't know plenty of numerical analysis and look
very carefully at the specifications of all operations they are
using.

I fear the whole argument is getting misguided. We should reset.
If you are doing numerical calculations then accuracy is critical.
Arbitrary precision floats are the only real (!) way of doing any
numeric non-integer calculation, and arbitrary precision integers are
the only way of doing integer calculations.
However speed is also an issue, so to obtain speed we have hardware
integer and floating point ALUs.
The cost for the integer ALU is bounded integers. Python appreciates
this and uses hardware integers when it can and software integers
otherwise. Thus Python is very good for doing integer work. C, C++, Go,
D, Fortran, etc. are fundamentally crap for integer calculation because
integers are bounded. Of course if calculations are provably within the
hardware integer bounds this is not a constraint and we are happy with
hardware integers. Just don't try calculating factorial, Fibonacci
numbers and other numbers used in some bioinformatics and quant models.
There is a reason why SciPy has a massive following in bioinformatics
and quant computing.
The cost for floating point ALU is accuracy. Hardware floating point
numbers are dreadful in that sense, but again the issue is speed and for
GPU they went 32-bit for speed. Now they are going 64-bit as they can
just about get the same speed and the accuracy is so much greater. For
hardware floating point the more bits you have the better. Hence IBM in
the 360 and later having 128-bit floating point for accuracy at the
expense of some speed. Sun had 128-bit in the SPARC processors for
accuracy at the expense of a little speed.
As Walter has or will tell us, C (and thus C++) got things woefully
wrong in support of numerical work because the inventors were focused on
writing operating systems, supporting only PDP hardware. They and the
folks that then wrote various algorithms didn't really get numerical
analysis. If C had targeted IBM 360 from the outset things might have
been better.
We have to be clear on this: Fortran is the only language that supports
hardware floating types even at all well.
Intel's 80-bit floating point were an aberration, they should just have
done 128-bit in the first place. OK so they got the 80-bit stuff as a
sort of free side-effect of creating 64-bit, but they ran with it. They
shouldn't have done. I cannot see it ever happening again. cf. ARM.
By being focused on Intel chips, D has failed to get floating point
correct in a very analogous way to C failing to get floating point types
right by focusing on PDP. Yes using 80-bit on Intel is good, but no-one
else has this. Floating point sizes should be 32-, 64-, 128-, 256-bit,
etc. D needs to be able to handle this. So does C, C++, Java, etc. Go
will be able to handle it when it is ported to appropriate hardware as
they use float32, float64, etc. as their types. None of this float,
double, long double, double double rubbish.
So D should perhaps make a breaking change and have types int32, int64,
float32, float64, float80, and get away from the vagaries of bizarre
type relationships with hardware?

+1 for float32 & co. These names are much more explicit than the
current ones. But I see two problems with it:

- These names are already used in core.simd to denote vectors, and AVX
3 (which should appear in mainstream CPUs next year) will require the use of
float16, so the next revision might cause a collision. This could be
avoided by using real32, real64... instead, but I prefer floatxx since
it reminds us that we are not dealing with an exact real number.

- These types are redundant, and people coming from C/C++ will likely
use float and double instead. It's much too late to think of deprecating
them since it would break backward compatibility (although it would be
trivial to update the code with "DFix"... if someone is still
maintaining the code).

A workaround would be to use a template which maps to the correct native
type, iff it has the exact number of bits specified, or issues an error.
Here is a quick mockup (does not support all types). I used "fp" instead
of "float" or "real" to avoid name collisions with the current types.

template fp(uint n) {
    static if (n == 32) {
        alias fp = float;
    } else static if (n == 64) {
        alias fp = double;
    } else static if (n == 80) {
        static if (real.mant_dig == 64) {
            alias fp = real;
        } else {
            static assert(false, "No 80 bit floating point type supported on this architecture");
        }
    } else static if (n == 128) {
        // Or doubledouble on PPC. Add other static ifs if necessary.
        alias fp = quadruple;
    } else {
        import std.conv : to;
        static assert(false, "No " ~ to!string(n) ~ " bit floating point type.");
    }
}

void main() {
    fp!32 x = 3.1415926;
    assert(is(typeof(x) == float));

    fp!64 y = 3.141592653589793;
    assert(is(typeof(y) == double));

    fp!80 z = 3.14159265358979323846;
    assert(is(typeof(z) == real));

    /* Fails on x86_64, as it should, but the error message could be made
     * more explicit.
     * Currently : "undefined identifier quadruple"
     * Should ideally be : "No native 128 bit floating-point type supported
     * on x86_64 architecture."
     */
    /*
    fp!128 w = 3.14159265358979323846264338327950288;
    assert(is(typeof(w) == quadruple));
    */
}

Walter Bright via Digitalmars-d

2014-06-29 00:41:33 UTC

+1 for float32 & co. These names are much more explicit than the current ones.

I don't see any relevance to this discussion with whether 32 bit floats are
named 'float' or 'float32'.

Russel Winder via Digitalmars-d

2014-06-29 18:16:52 UTC

On Sat, 2014-06-28 at 17:41 -0700, Walter Bright via Digitalmars-d

+1 for float32 & co. These names are much more explicit than the current ones.

I don't see any relevance to this discussion with whether 32 bit floats are
named 'float' or 'float32'.

This is getting way off the original thread, but…

The issue is what hardware representations are supported: what does
float mean? This is a Humpty Dumpty situation and "something must be
done". Hence Go stops with the undefined words and gives definite global
meanings to type names. It would be helpful if D eschewed the C/C++
heritage as well and got more definite about type names.

David Nadlinger via Digitalmars-d

2014-06-29 19:02:01 UTC

On Sunday, 29 June 2014 at 18:17:06 UTC, Russel Winder via

Post by Russel Winder via Digitalmars-d
This is getting way off the original thread, but…
The issue is what hardware representations are supported: what
does
float mean? This is a Humpty Dumpty situation and "something
must be
done". Hence Go stops with the undefined words and gives
definite global
meanings to type names. It would be helpful if D eschewed the
C/C++
heritage as well and got more definite about type names.

There is nothing Humpty Dumpty about the current situation. You
are simply missing the fact that float and double are already
defined as 32 bit/64 bit IEEE 754 compliant floating point
numbers in the spec.

There is nothing ambiguous about that, just as char/int/long have
defined bit-widths in D.

David

Walter Bright via Digitalmars-d

2014-06-29 19:19:57 UTC

Post by Russel Winder via Digitalmars-d
This is getting way off the original thread, but…
The issue is what hardware representations are supported: what does
float mean? This is a Humpty Dumpty situation and "something must be
done". Hence Go stops with the undefined words and gives definite global
meanings to type names. It would be helpful if D eschewed the C/C++
heritage as well and got more definite about type names.

There is nothing Humpty Dumpty about the current situation. You are simply
missing the fact that float and double are already defined as 32 bit/64 bit IEEE
754 compliant floating point numbers in the spec.
There is nothing ambiguous about that, just as char/int/long have defined
bit-widths in D.

Exactly. C/C++ has implementation-defined types, but D types are nailed down.

Russel Winder via Digitalmars-d

2014-06-29 21:30:29 UTC

On Sun, 2014-06-29 at 19:02 +0000, David Nadlinger via Digitalmars-d
wrote:
[…]

Post by David Nadlinger via Digitalmars-d
There is nothing Humpty Dumpty about the current situation. You
are simply missing the fact that float and double are already
defined as 32 bit/64 bit IEEE 754 compliant floating point
numbers in the spec.
There is nothing ambiguous about that, just as char/int/long have
defined bit-widths in D.

I think I am probably just getting "bloody minded" here, but…

If D is a language that uses the underlying hardware representation then
it cannot define the use of specific formats for hardware numbers. Thus,
on hardware that provides IEEE754 format hardware float and double can
map to the 32-bit and 64-bit IEEE754 numbers offered. However if the
hardware does not provide IEEE754 hardware then either D must interpret
floating point expressions (as per Java) or it cannot be ported to that
architecture. cf. IBM 360.

Fortunately more recent IBM hardware has multiple FPUs per core, one of
which provides IEEE754 as an option. (Pity the other FPUs cannot be
used :-)

Corollary: if D defines the "hardware" representation in its data model
then it can only be ported to hardware that uses that representation.

PS Walter just wrote that the type real is not defined as float and
double are, so it does have a Humpty Dumpty factor even if float and
double do not.
--
Russel.

Walter Bright via Digitalmars-d

2014-06-29 22:31:50 UTC

Post by Russel Winder via Digitalmars-d
If D is a language that uses the underlying hardware representation then
it cannot define the use of specific formats for hardware numbers. Thus,
on hardware that provides IEEE754 format hardware float and double can
map to the 32-bit and 64-bit IEEE754 numbers offered. However if the
hardware does not provide IEEE754 hardware then either D must interpret
floating point expressions (as per Java) or it cannot be ported to that
architecture. cf. IBM 360.

That's correct. The D spec says IEEE 754.

Post by Russel Winder via Digitalmars-d
PS Walter just wrote that the type real is not defined as float and
double are, so it does have a Humpty Dumpty factor even if float and
double do not.

It's still IEEE, just the longer lengths if they exist on the hardware.

D is not unique in requiring IEEE 754 floats - Java does, too. So does Javascript.

deadalnix via Digitalmars-d

2014-06-29 00:11:47 UTC

Post by John Colvin via Digitalmars-d
*The number of algorithms that are both numerically
stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME
when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not
just about performance, it's also about accuracy.

I still maintain that the need for the precision of 80bit reals
is a niche demand. It's a very important niche, but it doesn't
justify having its relatively extreme requirements be the
default. Someone writing a matrix inversion has only themselves
to blame if they don't know plenty of numerical analysis and
look very carefully at the specifications of all operations
they are using.
Paying the cost of moving to/from the fpu, missing out on
increasingly large SIMD units, these make everyone pay the
price.
inclusion of the 'real' type in D was a great idea, but
std.math should be overloaded for float/double/real so people
have the choice where they stand on the performance/precision
front.

Would it make sense to have std.math and std.fastmath, or
something along these lines?

Manu via Digitalmars-d

2014-06-30 04:16:06 UTC

On 29 June 2014 10:11, deadalnix via Digitalmars-d

*The number of algorithms that are both numerically stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not just about
performance, it's also about accuracy.

I still maintain that the need for the precision of 80bit reals is a niche
demand. It's a very important niche, but it doesn't justify having its
relatively extreme requirements be the default. Someone writing a matrix
inversion has only themselves to blame if they don't know plenty of
numerical analysis and look very carefully at the specifications of all
operations they are using.
Paying the cost of moving to/from the fpu, missing out on increasingly
large SIMD units, these make everyone pay the price.
inclusion of the 'real' type in D was a great idea, but std.math should be
overloaded for float/double/real so people have the choice where they stand
on the performance/precision front.

Would it make sense to have std.math and std.fastmath, or something along
these lines?

I've thought this too.
std.math and std.numeric maybe?

To me, 'fastmath' suggests comfort with approximations/estimates or
other techniques in favour of speed, and I don't think the non-'real'
version should presume that.
It's not that we have a 'normal' one and a 'fast' one. What we have is
a 'slow' one, and the other is merely normal; ie, "std.math".
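
The "overloaded for float/double/real" idea quoted above could look roughly like the sketch below. It is only an illustration: floorOf is a hypothetical name, and forwarding to the C library (floorf, floor, floorl from core.stdc.math) is an assumption made to keep the example short, not a statement about how std.math is or should be implemented.

```d
import core.stdc.math : floorf, floor, floorl;

// One overload per precision, so a float or double argument is never
// widened to real just to call the function.
float  floorOf(float x)  { return floorf(x); }
double floorOf(double x) { return floor(x); }
real   floorOf(real x)   { return floorl(x); }

void main()
{
    // The result type follows the argument type.
    static assert(is(typeof(floorOf(1.5f)) == float));
    static assert(is(typeof(floorOf(1.5))  == double));
    static assert(is(typeof(floorOf(1.5L)) == real));

    assert(floorOf(-1.5f) == -2.0f);
    assert(floorOf(2.7) == 2.0);
}
```

With a single module offering all three overloads, callers pick their point on the performance/precision trade-off through the argument type, without needing a separate "fast" module.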

Walter Bright via Digitalmars-d

2014-06-29 00:40:15 UTC

Post by Russel Winder via Digitalmars-d
By being focused on Intel chips, D has failed to get floating point
correct in a very analogous way to C failing to get floating point types
right by focusing on PDP.

Sorry, I do not follow the reasoning here.

Post by Russel Winder via Digitalmars-d
Yes using 80-bit on Intel is good, but no-one
else has this. Floating point sizes should be 32-, 64-, 128-, 256-bit,
etc. D needs to be able to handle this. So does C, C++, Java, etc. Go
will be able to handle it when it is ported to appropriate hardware as
they use float32, float64, etc. as their types. None of this float,
double, long double, double double rubbish.
So D should perhaps make a breaking change and have types int32, int64,
float32, float64, float80, and get away from the vagaries of bizarre
type relationships with hardware?

D's spec says that the 'real' type is the max size supported by the FP hardware.
How is this wrong?

Timon Gehr via Digitalmars-d

2014-06-29 01:14:17 UTC

Post by Russel Winder via Digitalmars-d
...
So D should perhaps make a breaking change and have types int32, int64,
float32, float64, float80, and get away from the vagaries of bizarre
type relationships with hardware?

D's spec says that the 'real' type is the max size supported by the FP
hardware. How is this wrong?

It is hardware-dependent.

Walter Bright via Digitalmars-d

2014-06-29 01:32:37 UTC

Post by Timon Gehr via Digitalmars-d

D's spec says that the 'real' type is the max size supported by the FP
hardware. How is this wrong?

It is hardware-dependent.

D does not require real to be 80 bits if the hardware does not support it.

Keep in mind that D is a systems programming language, and that implies you get
access to the hardware types.

Russel Winder via Digitalmars-d

2014-06-29 18:21:32 UTC

On Sat, 2014-06-28 at 17:40 -0700, Walter Bright via Digitalmars-d

Sorry, I do not follow the reasoning here.

By being focused on specific hardware, you create names for types that
do not port to other hardware. C and C++ really do not work well on IBM
hardware because the PDP heritage of the type names does not port from
PDP to IBM hardware.

D's spec says that the 'real' type is the max size supported by the FP hardware.
How is this wrong?

Because when reading the code you haven't got a f####### clue how
accurate the floating point number is until you ask and answer the
question "and which processor are you running this code on".

In many ways this is a trivial issue given C and C++ heritage, on the
other hand Go and other languages are changing the game such that C and
C++ thinking is being left behind.

Walter Bright via Digitalmars-d

2014-06-29 19:22:15 UTC

Post by Russel Winder via Digitalmars-d
Because when reading the code you haven't got a f####### clue how
accurate the floating point number is until you ask and answer the
question "and which processor are you running this code on".

That is not true with D. D specifies that float and double are IEEE 754 types
which have specified size and behavior. D's real type is the largest the
underlying hardware will support.

D also specifies 'int' is 32 bits, 'long' is 64, and 'byte' is 8, 'short' is 16.
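
These guarantees can be written down as compile-time checks. A short sketch: every static assert below is expected to hold on any conforming D compiler, while the properties of real are only printed at compile time because they differ between targets.

```d
// Fixed by the language, independent of the processor.
static assert(byte.sizeof == 1);
static assert(short.sizeof == 2);
static assert(int.sizeof == 4);
static assert(long.sizeof == 8);
static assert(float.sizeof == 4 && float.mant_dig == 24);   // IEEE 754 binary32
static assert(double.sizeof == 8 && double.mant_dig == 53); // IEEE 754 binary64

// real is the exception: storage size and precision vary by target.
pragma(msg, "real: ", real.sizeof, " bytes, ", real.mant_dig, " significand bits");

void main() {}
```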

John Colvin via Digitalmars-d

2014-06-29 21:04:06 UTC

Post by Russel Winder via Digitalmars-d
Because when reading the code you haven't got a f####### clue
how
accurate the floating point number is until you ask and answer the
question "and which processor are you running this code on".

That is not true with D. D specifies that float and double are
IEEE 754 types which have specified size and behavior. D's real
type is the largest the underlying hardware will support.
D also specifies 'int' is 32 bits, 'long' is 64, and 'byte' is
8, 'short' is 16.

I'm afraid that it is exactly true if you use `real`.

What important use-case is there for using `real` that shouldn't
also be accompanied by a `static assert(real.sizeof >= 10);` or
similar, for correctness reasons?

Assuming there isn't one, then what is the point of having a type
with hardware dependent precision? Isn't it just a useless
abstraction over the hardware that obscures useful intent?

mixin(`alias real` ~ (real.sizeof*8).stringof ~ ` = real;`);

is more useful to me.
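
A variation on the same idea, offered as a sketch rather than a proposal: classify real by its significand width (real.mant_dig) instead of its storage size, since sizeof also counts padding bytes (the 80-bit x87 format is commonly stored in 12 or 16 bytes). The names realFormat and realIsExtended are hypothetical.

```d
// Describe what real actually is on the current target.
static if (real.mant_dig == 64)
    enum realFormat = "x87 extended (80-bit)";
else static if (real.mant_dig == 53)
    enum realFormat = "same as double (64-bit)";
else static if (real.mant_dig == 113)
    enum realFormat = "IEEE 754 binary128";
else static if (real.mant_dig == 106)
    enum realFormat = "double-double";
else
    enum realFormat = "unrecognised";

// True only where real carries more precision than double.
enum bool realIsExtended = real.mant_dig > double.mant_dig;

void main()
{
    import std.stdio : writeln;
    writeln("real is ", realFormat, ", ", real.sizeof, " bytes in memory");
}
```

Code that genuinely depends on extended precision could then guard itself with `static assert(realIsExtended);`, at the cost of the portability concern raised in the replies.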

Iain Buclaw via Digitalmars-d

2014-06-29 21:28:19 UTC

On 29 June 2014 22:04, John Colvin via Digitalmars-d

That is not true with D. D specifies that float and double are IEEE 754
types which have specified size and behavior. D's real type is the largest
the underlying hardware will support.
D also specifies 'int' is 32 bits, 'long' is 64, and 'byte' is 8, 'short' is 16.

I'm afraid that it is exactly true if you use `real`.

There seems to be a circular argument going round here, it's tiring
bringing up the same point over and over again.

Post by John Colvin via Digitalmars-d
What important use-case is there for using `real` that shouldn't also be
accompanied by a `static assert(real.sizeof >= 10);` or similar, for
correctness reasons?

Breaks portability. There is just too much code out there that uses
real, and besides druntime/phobos math has already been ported to
handle all cases where real == 64bits.

Post by John Colvin via Digitalmars-d
Assuming there isn't one, then what is the point of having a type with
hardware dependent precision? Isn't it just a useless abstraction over the
hardware that obscures useful intent?
mixin(`alias real` ~ (real.sizeof*8).stringof ~ ` = real;`);

Good luck guessing which one to use. On GDC you have a choice of
three or four depending on what the default -m flags are. ;)

Walter Bright via Digitalmars-d

2014-06-29 22:34:46 UTC

Assuming there isn't one, then what is the point of having a type with hardware
dependent precision?

The point is D is a systems programming language, and the D programmer should
not be locked out of the hardware capabilities of the system he is running on.

D should not be constrained to be the least common denominator of all and future
processors.

Russel Winder via Digitalmars-d

2014-06-29 21:45:03 UTC

Hopefully there are points here for pedantry and bloody mindedness…

On Sat, 2014-06-28 at 18:32 -0700, Walter Bright via Digitalmars-d
wrote:
[…]

Post by Walter Bright via Digitalmars-d
Keep in mind that D is a systems programming language, and that implies you get
access to the hardware types.

On Sun, 2014-06-29 at 12:22 -0700, Walter Bright via Digitalmars-d
wrote:
[…]

Post by Walter Bright via Digitalmars-d
That is not true with D. D specifies that float and double are IEEE 754 types
which have specified size and behavior. D's real type is the largest the
underlying hardware will support.
D also specifies 'int' is 32 bits, 'long' is 64, and 'byte' is 8, 'short' is 16.

D gives access to the hardware types, and D defines the structure of all
those types. The only resolution is that D only works on that hardware
where the hardware types are the ones D defines. Thus D only works on a
subset of hardware, and can never be ported to hardware where the
hardware types differ from those defined by D.

So D float and double will not work on IBM 360 unless interpreted, and
real would be 128-bit (not IEEE)?

The D real type definitely suffers the C/C++ float and double problem!

I guess we just hope that all future hardware is IEEE754 compliant.

(This is both a trivial issue and a brick wall issue so let's keep things
humour-ful!)
--
Russel.

Iain Buclaw via Digitalmars-d

2014-06-29 22:18:43 UTC

On 29 June 2014 22:45, Russel Winder via Digitalmars-d

Post by Russel Winder via Digitalmars-d
Hopefully there are points here for pedantry and bloody mindedness…
On Sat, 2014-06-28 at 18:32 -0700, Walter Bright via Digitalmars-d
[…]

Post by Walter Bright via Digitalmars-d
Keep in mind that D is a systems programming language, and that
implies you get access to the hardware types.

On Sun, 2014-06-29 at 12:22 -0700, Walter Bright via Digitalmars-d
[…]

I'm sure it isn't as bad as you describe for float and double. And if
it is, then I can't see GCC (C/C++) working on IBM 360 either, without
some out-of-band patches. And so what you allege IBM to have done is
a platform problem, not a language one.

Support for IBM extended reals is partial, and will improve as PPC is ported to.

Post by Russel Winder via Digitalmars-d
The D real type definitely suffers the C/C++ float and double problem!
I guess we just hope that all future hardware is IEEE754 compliant.

else
static assert(false, "Here's a nickel, kid. Go buy yourself a real
computer."); // :)

H. S. Teoh via Digitalmars-d

2014-06-29 22:21:23 UTC

Post by Iain Buclaw via Digitalmars-d
On 29 June 2014 22:45, Russel Winder via Digitalmars-d

[...]

Post by Russel Winder via Digitalmars-d
The D real type definitely suffers the C/C++ float and double problem!
I guess we just hope that all future hardware is IEEE754 compliant.

else
static assert(false, "Here's a nickel, kid. Go buy yourself a real
computer."); // :)

+1. :-)

T

--
Debian GNU/Linux: Cray on your desktop.

Walter Bright via Digitalmars-d

2014-06-29 22:49:44 UTC

Post by Russel Winder via Digitalmars-d
So D float and double will not work on IBM 360 unless interpreted,

That's right.

On the other hand, someone could create a "D360" fork of the language that was
specifically targeted to the 360. Nothing wrong with that. Why burden the other
99.999999% of D programmers with 360 nutburger problems?

Post by Russel Winder via Digitalmars-d
I guess we just hope that all future hardware is IEEE754 compliant.

I'm not concerned about it. No CPU maker in their right head would do something
different.

I've witnessed decades of "portable" C code where the programmer tried to be
"portable" in his use of int's and char's, but never tested it on a machine
where those sizes are different, and when finally it was tested it turned out to
be broken.

Meaning that whether the D spec defines 360 portability or not, there's just no
way that FP code is going to be portable to the 360 unless someone actually
tests it.

1's complement, 10 bit bytes, 18 bit words, non-IEEE fp, are all DEAD. I can
pretty much guarantee you that about zero of C/C++ programs will actually work
without modification on those systems, despite the claims of the C/C++ Standard.

I'd also bet you that most C/C++ code will break if ints are 64 bits, and about
99% will break if you try to compile them with a 16 bit C/C++ compiler. 90% will
break if you feed it EBCDIC.

Andrei Alexandrescu via Digitalmars-d

2014-06-28 13:49:14 UTC

*The number of algorithms that are both numerically stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not just about
performance, it's also about accuracy.

The only problem is/would be when the language forces one choice over
the other. Both options of maximum performance and maximum precision
should be handily accessible to D users.

Andrei

Walter Bright via Digitalmars-d

2014-06-29 00:42:50 UTC

The only problem is/would be when the language forces one choice over the other.
Both options of maximum performance and maximum precision should be handily
accessible to D users.

That's a much more reasonable position than "we should abandon 80 bit reals".

Tofu Ninja via Digitalmars-d

2014-06-29 01:02:35 UTC

I think this thread is getting out of hand. The main point was to
get float and double overloads for std.math.

This whole discussion about numeric stability, the naming of
double and float, the state of real.... all of it is a little bit
ridiculous.

Numerical stability is not really related to getting faster
overloads other than the obvious fact that it is a trade off.
Float and double do not need a name change. Real also does not
need a change.

I think this thread needs to refocus on the main point, getting
math overloads for float and double and how to mitigate any
problems that might arise from that.

Andrei Alexandrescu via Digitalmars-d

2014-06-29 03:41:24 UTC

Post by Tofu Ninja via Digitalmars-d
I think this thread is getting out of hand. The main point was to
get float and double overloads for std.math.
This whole discussion about numeric stability, the naming of
double and float, the state of real.... all of it is a little bit
ridiculous.
Numerical stability is not really related to getting faster
overloads other than the obvious fact that it is a trade off.
Float and double do not need a name change. Real also does not
need a change.
I think this thread needs to refocus on the main point, getting
math overloads for float and double and how to mitigate any
problems that might arise from that.

Yes please. -- Andrei

H. S. Teoh via Digitalmars-d

2014-06-29 04:46:36 UTC

[...]

Post by Tofu Ninja via Digitalmars-d
I think this thread needs to refocus on the main point, getting
math overloads for float and double and how to mitigate any
problems that might arise from that.

Yes please. -- Andrei

Let's see the PR!

And while we're on the topic, what about working on making std.math
CTFE-able? So far, CTFE simply doesn't support fundamental
floating-point operations like isInfinity, isNaN, signbit, to name a
few, because CTFE does not allow accessing the bit representation of
floating-point values. This is a big disappointment for me -- it defeats
the power of CTFE by making it unusable if you want to use it to
generate pre-calculated tables of values.

Perhaps we can introduce some intrinsics for implementing these
functions so that they work both in CTFE and at runtime?

https://issues.dlang.org/show_bug.cgi?id=3749

Thanks to Iain's hard work on std.math, now we have software
implementations for all(?) the basic math functions, so in theory they
should be CTFE-able -- except that some functions require access to the
floating-point bit representation, which CTFE doesn't support. All it
takes is to add these primitives, and std.math will be completely CTFE-able
-- a big step forward IMHO.

T

--
Talk is cheap. Whining is actually free. -- Lars Wirzenius

Iain Buclaw via Digitalmars-d

2014-06-29 07:54:49 UTC

On 29 Jun 2014 05:48, "H. S. Teoh via Digitalmars-d"

Post by Tofu Ninja via Digitalmars-d
I think this thread needs to refocus on the main point, getting
math overloads for float and double and how to mitigate any
problems that might arise from that.

Yes please. -- Andrei

Let's see the PR!

I've already raised one (already linked in this thread).

More to come!

Post by H. S. Teoh via Digitalmars-d
And while we're on the topic, what about working on making std.math
CTFE-able? So far, CTFE simply doesn't support fundamental
floating-point operations like isInfinity, isNaN, signbit, to name a
few, because CTFE does not allow accessing the bit representation of
floating-point values.

As it stands, as soon as the above mentioned PR for Phobos is merged, isNaN
and isInfinite on float and double types will be CTFE-able. However that
depends on whether or not float->int painting will be replaced with a union.

Post by H. S. Teoh via Digitalmars-d
This is a big disappointment for me -- it defeats
the power of CTFE by making it unusable if you want to use it to
generate pre-calculated tables of values.
Perhaps we can introduce some intrinsics for implementing these
functions so that they work both in CTFE and at runtime?
https://issues.dlang.org/show_bug.cgi?id=3749

CTFE support for accessing basic types in unions - as in painting between
all kinds of scalar types, with special support for static arrays (via
vectors) should be all that is required.

Once CTFE supports that, it won't be difficult to get std.math to be
CTFE-certified. :)

Post by H. S. Teoh via Digitalmars-d
Thanks to Iain's hard work on std.math, now we have software
implementations for all(?) the basic math functions, so in theory they
should be CTFE-able -- except that some functions require access to the
floating-point bit representation, which CTFE doesn't support. All it
takes is to add these primitives, and std.math will be completely CTFE-able
-- a big step forward IMHO.

The original goal was making std.math non-asm implementations *genuinely*
pure/nothrow/@safe for GDC x86, and for other ports like ARM, SPARC so LDC
benefits also.

Andrei was the one who sold me on the idea of making them CTFE-able.
However, I stopped just short of that goal because of this missing feature
of DMD - though I did implement it in GDC as proof of concept that it is
possible (code not actually published anywhere).

There should be a bug report somewhere that I outlined the exact steps in.

Regards
Iain.

H. S. Teoh via Digitalmars-d

2014-06-29 22:20:23 UTC

Post by Iain Buclaw via Digitalmars-d
On 29 Jun 2014 05:48, "H. S. Teoh via Digitalmars-d"

On Sat, Jun 28, 2014 at 08:41:24PM -0700, Andrei Alexandrescu via
[...]

Post by Tofu Ninja via Digitalmars-d
I think this thread needs to refocus on the main point, getting
math overloads for float and double and how to mitigate any
problems that might arise from that.

Yes please. -- Andrei

Let's see the PR!

I've already raised one (already linked in this thread).

Are you talking about #2274? Interesting that your implementation is
basically identical to my own idea for fixing std.math -- using unions
instead of pointer casting. However, without compiler support for
repainting scalars in a union, I couldn't get what I needed to work. I'm
trying to make atan2 CTFE-able, but it in turn uses isInfinity, isNaN,
signbit, and possibly one or two others, and while I managed to hack
isInfinity and isNaN, signbit defeated me due to the signedness of NaNs,
which cannot be extracted in any other way.
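The kind of workaround described here can be sketched with plain comparisons, which need no access to the bit representation and therefore evaluate in CTFE. The helper names below are hypothetical and are not the std.math implementations; the last assertion shows why the same trick cannot recover the sign of a NaN.

```d
// Hypothetical helpers; not the functions from std.math.
bool isNaNByCompare(T)(T x) pure nothrow @nogc @safe
{
    return x != x; // NaN is the only value that compares unequal to itself
}

bool isInfinityByCompare(T)(T x) pure nothrow @nogc @safe
{
    return x == T.infinity || x == -T.infinity;
}

// static assert forces compile-time evaluation, so these show that
// the helpers run in CTFE.
static assert(isNaNByCompare(double.nan));
static assert(!isNaNByCompare(1.0));
static assert(isInfinityByCompare(-float.infinity));
static assert(!isInfinityByCompare(float.max));

// Every ordered comparison with a NaN is false, so a comparison-based
// sign test gives the same answer for both NaN signs.
static assert(!(double.nan < 0) && !(-double.nan < 0));

void main() {}
```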

Post by Iain Buclaw via Digitalmars-d
More to come!

Are you going to implement repainting unions in CTFE? That would be
*awesome*.

[...]

Perhaps we can introduce some intrinsics for implementing these
functions so that they work both in CTFE and at runtime?
https://issues.dlang.org/show_bug.cgi?id=3749

CTFE support for accessing basic types in unions - as in painting
between all kinds of scalar types, with special support for static
arrays (via vectors) should be all that is required.

I thought as much. Currently, unions don't support repainting in CTFE,
so it doesn't work. I think that's the last hurdle needed, since with
repainting everything else can be done in code.

Post by Iain Buclaw via Digitalmars-d
Once CTFE supports that, it won't be difficult to get std.math to be
CTFE-certified. :)

[...]

Looking forward to that!

T

--
Why ask rhetorical questions? -- JC

Iain Buclaw via Digitalmars-d

2014-06-29 22:33:20 UTC

Post by H. S. Teoh via Digitalmars-d

On 29 June 2014 23:20, H. S. Teoh via Digitalmars-d

Post by Iain Buclaw via Digitalmars-d
On 29 Jun 2014 05:48, "H. S. Teoh via Digitalmars-d"

On Sat, Jun 28, 2014 at 08:41:24PM -0700, Andrei Alexandrescu via
[...]

Post by Tofu Ninja via Digitalmars-d
I think this thread needs to refocus on the main point, getting
math overloads for float and double and how to mitigate any
problems that might arise from that.

Yes please. -- Andrei

Let's see the PR!

I've already raised one (already linked in this thread).

Are you talking about #2274? Interesting that your implementation is
basically identical to my own idea for fixing std.math -- using unions
instead of pointer casting.

Not really. The biggest speed up was from adding float+double
overloads for floor, ceil, isNaN and isInfinity. Firstly, the use of
a union itself didn't make much of a dent in the speed up. Removing
the slow array copy operation did though. Secondly, unions are
required for this particular function (floor) because we need to set
bits through type-punning, it just wouldn't work casting to a pointer.
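For readers unfamiliar with the technique, reading and setting bits of a double through a union looks roughly like the sketch below. The union and function names are hypothetical and this is not the code from the pull request; it runs at runtime and makes no claim about CTFE support.

```d
// Hypothetical names; a runtime illustration of type-punning via a union.
union DoubleBits
{
    double value;
    ulong bits;
}

// Reads the sign bit (bit 63 of an IEEE 754 binary64 value).
bool signBitOf(double x) pure nothrow @nogc @trusted
{
    DoubleBits u;
    u.value = x;
    return (u.bits >> 63) != 0;
}

// Sets bits through the union: clears the sign bit and reads the
// result back as a double.
double clearSign(double x) pure nothrow @nogc @trusted
{
    DoubleBits u;
    u.value = x;
    u.bits &= ~(1UL << 63);
    return u.value;
}

void main()
{
    assert(signBitOf(-0.0));   // negative zero carries a sign bit
    assert(!signBitOf(0.0));
    assert(clearSign(-2.5) == 2.5);
}
```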

Regards
Iain

Timon Gehr via Digitalmars-d

2014-06-29 01:18:13 UTC

The only problem is/would be when the language forces one choice over the other.
Both options of maximum performance and maximum precision should be handily
accessible to D users.

That's a much more reasonable position than "we should abandon 80 bit reals".

If that is what you were arguing against, I don't think this was
actually suggested.

Andrei Alexandrescu via Digitalmars-d

2014-06-29 03:40:58 UTC

The only problem is/would be when the language forces one choice over the other.
Both options of maximum performance and maximum precision should be handily
accessible to D users.

That's a much more reasonable position than "we should abandon 80 bit reals".

Awesome! -- Andrei

Manu via Digitalmars-d

2014-06-30 03:22:15 UTC

On 28 June 2014 16:16, Walter Bright via Digitalmars-d

*The number of algorithms that are both numerically stable/correct and benefit
significantly from > 64bit doubles is very small.

To be blunt, baloney. I ran into these problems ALL THE TIME when doing
professional numerical work.

Sorry for being so abrupt. FP is important to me - it's not just about
performance, it's also about accuracy.

Well, here's the thing then. Consider that 'real' is only actually
supported on only a single (long deprecated!) architecture.

I think it's reasonable to see that 'real' is not actually an fp type.
It's more like an auxiliary type, which just happens to be supported
via a completely different (legacy) set of registers on x64 (most
arch's don't support it at all).
In x64's case, it is deprecated for over a decade now, and may be
removed from the hardware at some unknown time. The moment that x64
processors decide to stop supporting 32bit code, the x87 will go away,
and those opcodes will likely be emulated or microcoded.
Interacting real<->float/double means register swapping through
memory. It should be treated the same as float<->simd; they are
distinct (on most arch's).

For my money, x87 can only be considered, at best, a coprocessor (a
slow one!), which may or may not exist. Software written today (10+
years after the hardware was deprecated) should probably even consider
introducing runtime checks to see if the hardware is even present
before making use of it.

It's fine to offer a great precise extended precision library, but I
don't think it can be _the_ standard math library which is used by
everyone in virtually all applications. It's not a defined part of the
architecture, it's slow, and it will probably go away in the future.

It's the same situation with SIMD; on x64, the SIMD unit and the FPU
are the same unit, but I don't think it's reasonable to design all the
API's around that assumption. Most processors separate the SIMD unit
from the FPU, and the language decisions reflect that. We can't make
the language treat SIMD just like an FPU extension on account of just
one single architecture... although in that case, the argument would
be even more compelling since x64 is actually current and active.

Walter Bright via Digitalmars-d

2014-06-30 04:15:40 UTC

Post by Manu via Digitalmars-d
Well, here's the thing then. Consider that 'real' is only actually
supported on only a single (long deprecated!) architecture.

It's news to me that x86, x86-64, etc., are deprecated, despite being used to
run pretty much all desktops and laptops and even servers. The 80 bit reals are
also part of the C ABI for Linux, OSX, and FreeBSD, 32 and 64 bit.

Post by Manu via Digitalmars-d
I think it's reasonable to see that 'real' is not actually an fp type.

I find that a bizarre statement.

Post by Manu via Digitalmars-d
It's more like an auxiliary type, which just happens to be supported
via a completely different (legacy) set of registers on x64 (most
arch's don't support it at all).

The SIMD registers are also a "completely different set of registers".

Post by Manu via Digitalmars-d
In x64's case, it is deprecated for over a decade now, and may be
removed from the hardware at some unknown time. The moment that x64
processors decide to stop supporting 32bit code, the x87 will go away,
and those opcodes will likely be emulated or microcoded.
Interacting real<->float/double means register swapping through
memory. It should be treated the same as float<->simd; they are
distinct (on most arch's).

Since they are part of the 64 bit C ABI, that would seem to be in the category
of "nevah hoppen".

Post by Manu via Digitalmars-d
It's the same situation with SIMD; on x64, the SIMD unit and the FPU
are the same unit, but I don't think it's reasonable to design all the
API's around that assumption. Most processors separate the SIMD unit
from the FPU, and the language decisions reflect that. We can't make
the language treat SIMD just like an FPU extension on account of just
one single architecture... although in that case, the argument would
be even more compelling since x64 is actually current and active.

Intel has yet to remove any SIMD instructions.

Manu via Digitalmars-d

2014-06-30 04:38:13 UTC

On 30 June 2014 14:15, Walter Bright via Digitalmars-d

Post by Manu via Digitalmars-d
Well, here's the thing then. Consider that 'real' is only actually
supported on only a single (long deprecated!) architecture.

It's news to me that x86, x86-64, etc., are deprecated, despite being used
to run pretty much all desktops and laptops and even servers. The 80 bit
reals are also part of the C ABI for Linux, OSX, and FreeBSD, 32 and 64 bit.

x86_64 and x86 are different architectures, and they have very different ABI's.
Nobody is manufacturing x86 (exclusive) cpu's.
Current x86_64 cpu's maintain a backwards compatibility mode, but
that's not a part of the x86-64 spec, and may go away when x86_64 is
deemed sufficiently pervasive and x86 sufficiently redundant.

Post by Manu via Digitalmars-d
I think it's reasonable to see that 'real' is not actually an fp type.

I find that a bizarre statement.

Well, it's not an fp type as implemented by the standard fp
architecture of any cpu except x86, which is becoming less relevant
with each passing day.

The SIMD registers are also a "completely different set of registers".

Correct, so they are deliberately treated separately.
I argued for strong separation between simd and float, and you agreed.

Since they are part of the 64 bit C ABI, that would seem to be in the
category of "nevah hoppen".

Not in windows. You say they are in linux? I don't know.

"Intel started discouraging the use of x87 with the introduction of
the P4 in late 2000. AMD deprecated x87 since the K8 in 2003, as
x86-64 is defined with SSE2 support; VIA's C7 has supported SSE2 since
2005. In 64-bit versions of Windows, x87 is deprecated for user-mode,
and prohibited entirely in kernel-mode."

How do you distinguish x87 double and xmm double in C? The only way I
know to access x87 is with inline asm.

Intel has yet to remove any SIMD instructions.

Huh? I think you misunderstood my point. I'm saying that fpu/simd
units are distinct, and they are distanced by the type system in order
to respect that separation.

Kagamin via Digitalmars-d

2014-06-27 18:47:33 UTC

I think, make real==double on x86-64, like on other
architectures, because double is the way to go.

Walter Bright via Digitalmars-d

2014-06-28 05:28:11 UTC

I think, make real==double on x86-64, like on other architectures, because
double is the way to go.

No.

Consider also that on non-Windows platforms, such a decision would shut D out
from accessing C code written using long doubles.

BTW, there's a reason Fortran is still king for numerical work - that's because
C compiler devs typically do not understand floating point math and provide
crappy imprecise math functions. I had an argument with a physics computation
prof a year back who was gobsmacked when I told him the FreeBSD 80 bit math
functions were only accurate to 64 bits. He told me he didn't believe me, that C
wouldn't make such mistakes. I suggested he test it and see for himself :-)

They can and do. The history of C, including the C Standard, shows a lack of
knowledge of how to do numerical math. For example, it was years and years
before the Standard mentioned what the math functions should do with infinity
arguments.

Things have gotten better in recent years, but I'd always intended that D out of
the gate have proper support for fp, including fully accurate math functions.
The reason D re-implements the math functions in Phobos rather than deferring to
the C ones is the unreliability of the C ones.

Walter Bright via Digitalmars-d

2014-06-28 05:16:30 UTC

Post by Manu via Digitalmars-d
Totally agree.
Maintaining commitment to deprecated hardware which could be removed
from the silicon at any time is a bit of a problem looking forwards.
Regardless of the decision about whether overloads are created, at
very least, I'd suggest x64 should define real as double, since the
x87 is deprecated, and x64 ABI uses the SSE unit. It makes no sense at
all to use real under any general circumstances in x64 builds.
And aside from that, if you *think* you need real for precision, the
truth is, you probably have bigger problems.
Double already has massive precision. I find it's extremely rare to
have precision problems even with float under most normal usage
circumstances, assuming you are conscious of the relative magnitudes
of your terms.

That's a common perception of people who do not use the floating point unit for
numerical work, and whose main concern is speed instead of accuracy.

I've done numerical floating point work. Two common cases where such precision
matters:

1. numerical integration
2. inverting matrices

It's amazing how quickly precision gets overwhelmed and you get garbage answers.
For example, when inverting a matrix with doubles, the results are garbage for
larger than 14*14 matrices or so. There are techniques for dealing with this,
but they are complex and difficult to implement.

Increasing the precision is the most straightforward way to deal with it.
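A small illustration of how rounding error accumulates over many steps, and how a wider type postpones it: the sketch below adds 0.1 one million times in each floating-point type. The function name is hypothetical, and the exact figures printed depend on the compiler and target, so none are quoted here.

```d
import std.math : fabs;
import std.stdio : writefln;

// Hypothetical helper: adds the type's nearest value to 0.1, `steps` times.
T accumulate(T)(size_t steps)
{
    immutable T step = cast(T) 0.1;
    T sum = 0;
    foreach (i; 0 .. steps)
        sum += step;
    return sum;
}

void main()
{
    enum steps = 1_000_000;
    immutable f = accumulate!float(steps);
    immutable d = accumulate!double(steps);
    immutable r = accumulate!real(steps);

    // The mathematically exact result would be 100000.
    writefln("float:  %.6f", f);
    writefln("double: %.6f", d);
    writefln("real:   %.6f", r);

    // The double sum ends up closer to 100000 than the float sum.
    assert(fabs(d - 100_000.0) < fabs(f - 100_000.0));
}
```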

Note that the 80 bit precision comes from W.F. Kahan, and he's no fool when
dealing with these issues.

Another boring Boeing anecdote: calculators have around 10 digits of precision.
A colleague of mine was doing a multi-step calculation, and rounded each step to
2 decimal points. I told him he needed to keep the full 10 digits. He ridiculed
me - but his final answer was off by a factor of 2. He could not understand why,
and I'd explain, but he could never get how his 2 places past the decimal point
did not work.

Do you think engineers like that will ever understand the problems with double
precision, or have the remotest idea how to deal with them beyond increasing the
precision? I don't.

Post by Manu via Digitalmars-d
I find it's extremely rare to have precision problems even with float under
most normal usage circumstances,

Then you aren't doing numerical work, because it happens right away.

Kapps via Digitalmars-d

2014-06-29 05:58:49 UTC

Post by Walter Bright via Digitalmars-d
That's a common perception of people who do not use the
floating point unit for numerical work, and whose main concern
is speed instead of accuracy.
<snip>

Post by Manu via Digitalmars-d
I find it's extremely rare to have precision problems even
with float under most normal usage circumstances,

Then you aren't doing numerical work, because it happens right
away.

There are of course many situations where high precision is
necessary. But in these situations you have 'real' available to
you, which presumably would maintain as high precision as is
possible. In the situations where you're using float/double, you
should not be expecting maximum precision and instead performance
should be focused on. There are existing overloads for 'real' in
std.math that presumably would not go away, the only need now is
to add new overloads for float/double that can take advantage of
SSE instructions. While striving for precision is nice, in a huge
majority of situations it's simply not necessary (and when it is,
'real' will be used). It makes D look bad when it does so poorly
on benchmarks like this simply so that the output perlin noise
can be rounded from 238.32412319 to 238 instead of 238.32 to 238.

Manu via Digitalmars-d

2014-06-30 02:58:05 UTC

On 28 June 2014 15:16, Walter Bright via Digitalmars-d

This is what I was alluding to wrt being aware of the relative
magnitudes of terms in operations.
You're right it can be a little complex, but it's usually just a case
of rearranging the operations a bit, or worst case, a temporary
renormalisation from time to time.

Post by Walter Bright via Digitalmars-d
Increasing the precision is the most straightforward way to deal with it.

Is a 14*14 matrix really any more common than a 16*16 matrix though?
It just moves the goal post a bit. Numerical integration will always
manage to find its way into crazy big or crazy small numbers. It's
all about relative magnitude with floats.
'real' is only good for about 4 more significant digits... I've often
thought they went a bit overboard on exponent and skimped on mantissa.
Surely most users would reach for a lib in these cases anyway, and
they would be written by an expert.
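The size of that gain can be read from the built-in type properties. On targets where real is the x87 80-bit format, the mantissa has 64 bits against 53 for double, which is 11 extra bits or about 3.3 decimal digits. The sketch below prints the figures for whatever target it is compiled on; where real is the same as double, the difference is zero.

```d
import std.math : log10;
import std.stdio : writefln;

void main()
{
    writefln("float:  %2d mantissa bits, %2d decimal digits",
             float.mant_dig, float.dig);
    writefln("double: %2d mantissa bits, %2d decimal digits",
             double.mant_dig, double.dig);
    writefln("real:   %2d mantissa bits, %2d decimal digits",
             real.mant_dig, real.dig);

    // Each mantissa bit is worth log10(2), roughly 0.301 decimal digits.
    immutable extraBits = real.mant_dig - double.mant_dig;
    writefln("real adds %d bits, about %.1f decimal digits",
             extraBits, extraBits * log10(2.0L));
}
```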

Either way, I don't think it's sensible to have a std api defy the arch ABI.

Post by Walter Bright via Digitalmars-d
Note that the 80 bit precision comes from W.F. Kahan, and he's no fool when
dealing with these issues.

I never argued this. I'm just saying I can't see how defying the ABI
in a std api could be seen as a good idea applied generally to all
software.

Post by Walter Bright via Digitalmars-d
Another boring Boeing anecdote: calculators have around 10 digits of
precision. A colleague of mine was doing a multi-step calculation, and
rounded each step to 2 decimal points. I told him he needed to keep the full
10 digits. He ridiculed me - but his final answer was off by a factor of 2.
He could not understand why, and I'd explain, but he could never get how his
2 places past the decimal point did not work.

Rounding down to 2 decimal points is rather different than rounding
from 19 to 15 decimal points.

Post by Walter Bright via Digitalmars-d
Do you think engineers like that will ever understand the problems with
double precision, or have the remotest idea how to deal with them beyond
increasing the precision? I don't.

I think they would use a library.
Either way, those jobs are so rare, I don't see that it's worth
defying the arch ABI across the board for it.

I think there should be a 'double' overload. The existing real
overload would be chosen when people use the real type explicitly.
Another advantage of this, is that when people are using the double
type, the API will produce the same results on all architectures,
including the ones that don't have 'real'.
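How such overloads would be selected can be sketched with a stand-in function. The name halve is hypothetical and is not part of std.math; the point is only that each argument type picks the overload of the same type, so float and double callers never go through real.

```d
// Hypothetical function, used only to illustrate overload selection.
float  halve(float x)  pure nothrow @nogc @safe { return x / 2; }
double halve(double x) pure nothrow @nogc @safe { return x / 2; }
real   halve(real x)   pure nothrow @nogc @safe { return x / 2; }

void main()
{
    float f = 3.0f;
    double d = 3.0;
    real r = 3.0L;

    // The result type shows which overload was chosen for each call.
    static assert(is(typeof(halve(f)) == float));
    static assert(is(typeof(halve(d)) == double));
    static assert(is(typeof(halve(r)) == real));

    assert(halve(f) == 1.5f);
    assert(halve(d) == 1.5);
    assert(halve(r) == 1.5L);
}
```

Code that names the real type explicitly keeps the existing behaviour, which matches the suggestion above.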