%a format in tests-ulc*.c

Discussion:

Gisle Vanem

2017-04-19 12:13:26 UTC

When using MSVC-2015 to build the tests/unistdio/test-ulc-*.c files,
I get ASSERT() failures on all the '%a' formats, e.g. in unistdio/test-ulc-vasnprintf1.exe
and unistdio/test-ulc-printf1.h (line 195):

    char *result =
      my_xasprintf ("%a %d", 3.1416015625, 33, 44, 55);
    ASSERT (result != NULL);
    ASSERT (strcmp (result, "0x1.922p+1 33") == 0
            || strcmp (result, "0x3.244p+0 33") == 0
            || strcmp (result, "0x6.488p-1 33") == 0
            || strcmp (result, "0xc.91p-2 33") == 0);

The 'result' I get is '0x1.9220000000000p+1 33'.

The MSDN at:
https://msdn.microsoft.com/en-us/library/hf4y5e3w.aspx

isn't clear on how many digits there should be, but I guess the precision
reflects the mantissa width of the double type.

With "%.3a %d", I do get the expected "0x1.922p+1 33".
So are these tests somewhat gcc-centric or what?

--
--gv

Paul Eggert

2017-04-20 00:36:57 UTC

Post by Gisle Vanem
With "%.3a %d", I do get the expected "0x1.922p+1 33".
So are these tests somewhat gcc-centric or what?

Yes. It looks to me like MSVC-2015 is right and glibc is wrong, at least
in the sense of acting like standard printf.

Gisle Vanem

2017-04-20 11:40:49 UTC

Post by Gisle Vanem
With "%.3a %d", I do get the expected "0x1.922p+1 33".
So are these tests somewhat gcc-centric or what?

Yes. It looks to me like MSVC-2015 is right and glibc is wrong, at least in the sense of acting like standard printf.

It seems strange that there should be a difference in output, since
MSVC's internal *printf() does not seem to be involved. Does this perhaps have
something to do with 'BEGIN_LONG_DOUBLE_ROUNDING()' used in
printf-frexp.c?

This macro is effective for gcc only using inline assembly.
It could be done using '_controlfp()' (or '_controlfp_s()')
for MSVC too AFAICS.

--
--gv

Bruno Haible

2017-04-21 22:58:59 UTC

Hi Paul,

Post by Paul Eggert

Post by Gisle Vanem
With "%.3a %d", I do get the expected "0x1.922p+1 33".
So are these tests somewhat gcc-centric or what?

Yes. It looks to me like MSVC-2015 is right and glibc is wrong, at least
in the sense of acting like standard printf.

I agree that MSVC 14 is right, quoting [1]
"a, A ... if the precision is missing and FLT_RADIX is a power of 2,
then the precision shall be sufficient for an exact representation
of the value"
This sentence gives an implementation the freedom to append as many zeroes as
it wants.

But the number we print is 3.1416015625 = 3217 / 2^10; therefore no rounding
is involved. Why would you consider the expected result "0x1.922p+1" wrong?

Bruno

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html

Paul Eggert

2017-04-22 09:28:38 UTC

Post by Bruno Haible
Why would you consider the expected result "0x1.922p+1" wrong?

The POSIX spec says "if the precision is missing and FLT_RADIX is a power of 2,
then the precision shall be sufficient for an exact representation of the value;
if the precision is missing and FLT_RADIX is not a power of 2, then the
precision shall be sufficient to distinguish values of type double, except that
trailing zeros may be omitted". Since the spec goes to the trouble of saying
that trailing zeros may be omitted when FLT_RADIX is not a power of 2, and does
not go to this trouble when FLT_RADIX is a power of 2, I inferred that when
FLT_RADIX is a power of 2, trailing zeros cannot be omitted.

Although your interpretation is also plausible, if it is correct then I am
puzzled why the "trailing zeros may be omitted" wording is present. Is there
some subtle distinction between "sufficient for an exact representation of the
value" and "sufficient to distinguish values" that I am not getting?

Bruno Haible

2017-04-22 11:11:51 UTC

Post by Paul Eggert

Post by Bruno Haible
Why would you consider the expected result "0x1.922p+1" wrong?

The POSIX spec says "if the precision is missing and FLT_RADIX is a power of 2,
then the precision shall be sufficient for an exact representation of the value;
if the precision is missing and FLT_RADIX is not a power of 2, then the
precision shall be sufficient to distinguish values of type double, except that
trailing zeros may be omitted". Since the spec goes to the trouble of saying
that trailing zeros may be omitted when FLT_RADIX is not a power of 2, and does
not go to this trouble when FLT_RADIX is a power of 2, I inferred that when
FLT_RADIX is a power of 2, trailing zeros cannot be omitted.
Although your interpretation is also plausible, if it is correct then I am
puzzled why the "trailing zeros may be omitted" wording is present.

Two different algorithms are involved. The first case is simpler and needs less
wording. The second case involves considering the sequence of consecutive
floating-point numbers of the given format. For example, taking consecutive
IEEE double-precision values and hypothetically printing them with FLT_RADIX=3,
the sequence is

(DECIMAL) (FLT_RADIX=3)
3.1416015625 1.001021102001021212212021121202001202200212000111112012201111...p+1
3.1416015625000004 1.001021102001021212212021121202002120220202021122002022121111...p+1
3.141601562500001 1.001021102001021212212021121202010102010122112202122102111112...p+1
3.1416015625000013 1.001021102001021212212021121202011020100112210220012112101112...p+1
3.1416015625000018 1.001021102001021212212021121202012001120110002000202122021120...p+1
3.141601562500002 1.001021102001021212212021121202012212210100100011022202011120...p+1
3.1416015625000027 1.001021102001021212212021121202020201000020121021212212001121...p+1
3.141601562500003 1.001021102001021212212021121202021112020010212102102221221122...p+1
3.1416015625000036 1.001021102001021212212021121202022100110001010120000001211122...p+1
3.141601562500004 1.001021102001021212212021121202100011122221101200120011201200...p+1
3.1416015625000044 1.001021102001021212212021121202100222212211122211010021121200...p+1
3.141601562500005 1.001021102001021212212021121202101211002201220221200101111201...p+1
3.1416015625000053 1.001021102001021212212021121202102122022122012002020111101201...p+1
3.1416015625000058 1.001021102001021212212021121202110110112112110012210121021202...p+1
3.141601562500006 1.001021102001021212212021121202111021202102201100100201011210...p+1
3.1416015625000067 1.001021102001021212212021121202112002222022222110220211001210...p+1
3.141601562500007 1.001021102001021212212021121202112221012020020121110220221211...p+1
3.1416015625000075 1.001021102001021212212021121202120202102010111202001000211211...p+1
3.141601562500008 1.001021102001021212212021121202121120122000202212121010201212...p+1
3.1416015625000084 1.001021102001021212212021121202122101211221001000011020121212...p+1

With just "the precision shall be sufficient to distinguish values of type double"
the spec could be interpreted as if the result must always have the same number
of digits:

(DECIMAL) (FLT_RADIX=3)
3.1416015625 1.0010211020010212122120211212020012p+1
3.1416015625000004 1.0010211020010212122120211212020022p+1
3.141601562500001 1.0010211020010212122120211212020101p+1
3.1416015625000013 1.0010211020010212122120211212020111p+1
3.1416015625000018 1.0010211020010212122120211212020120p+1
3.141601562500002 1.0010211020010212122120211212020200p+1
3.1416015625000027 1.0010211020010212122120211212020202p+1
3.141601562500003 1.0010211020010212122120211212020212p+1
3.1416015625000036 1.0010211020010212122120211212020221p+1
3.141601562500004 1.0010211020010212122120211212021001p+1
3.1416015625000044 1.0010211020010212122120211212021010p+1
3.141601562500005 1.0010211020010212122120211212021012p+1
3.1416015625000053 1.0010211020010212122120211212021022p+1
3.1416015625000058 1.0010211020010212122120211212021101p+1
3.141601562500006 1.0010211020010212122120211212021111p+1
3.1416015625000067 1.0010211020010212122120211212021120p+1
3.141601562500007 1.0010211020010212122120211212021200p+1
3.1416015625000075 1.0010211020010212122120211212021202p+1
3.141601562500008 1.0010211020010212122120211212021212p+1
3.1416015625000084 1.0010211020010212122120211212021221p+1

The additional sentence "except that trailing zeros may be omitted" means
that it is OK to produce this:

(DECIMAL) (FLT_RADIX=3)
3.1416015625 1.0010211020010212122120211212020012p+1
3.1416015625000004 1.0010211020010212122120211212020022p+1
3.141601562500001 1.0010211020010212122120211212020101p+1
3.1416015625000013 1.0010211020010212122120211212020111p+1
3.1416015625000018 1.001021102001021212212021121202012p+1
3.141601562500002 1.00102110200102121221202112120202p+1
3.1416015625000027 1.0010211020010212122120211212020202p+1
3.141601562500003 1.0010211020010212122120211212020212p+1
3.1416015625000036 1.0010211020010212122120211212020221p+1
3.141601562500004 1.0010211020010212122120211212021001p+1
3.1416015625000044 1.001021102001021212212021121202101p+1
3.141601562500005 1.0010211020010212122120211212021012p+1
3.1416015625000053 1.0010211020010212122120211212021022p+1
3.1416015625000058 1.0010211020010212122120211212021101p+1
3.141601562500006 1.0010211020010212122120211212021111p+1
3.1416015625000067 1.001021102001021212212021121202112p+1
3.141601562500007 1.00102110200102121221202112120212p+1
3.1416015625000075 1.0010211020010212122120211212021202p+1
3.141601562500008 1.0010211020010212122120211212021212p+1
3.1416015625000084 1.0010211020010212122120211212021221p+1

The algorithm for the second case also applies to the first case, but
no engineer with a sane mind would apply this algorithm to the first case.
Instead every normal programmer will use the direct conversion algorithm
(which does not consider sequences) for the first case.

Post by Paul Eggert
Is there
some subtle distinction between "sufficient for an exact representation of the
value" and "sufficient to distinguish values" that I am not getting?

The first sentence applies to a single value; "sufficient" implies that you can
omit trailing zeroes.

The second sentence can be interpreted as if "sufficient" means a number of digits
that is independent of the value.

Bruno

Paul Eggert

2017-04-22 21:17:44 UTC

Post by Bruno Haible
The algorithm for the second case also applies to the first case, but
no engineer with a sane mind would apply this algorithm to the first case.

Ah, that explains it! I was crazy, because I indeed thought of just that one
interpretation, and applied it to both cases.

Perhaps this is because I am a fan of shorter, more-intuitive numbers. You can
blame me for the fact that in Emacs the double-precision floating-point number
closest to 0.1 displays as "0.1" rather than as the more-precise but uglier
"0.10000000000000001".

Bruno Haible

2017-04-23 01:09:22 UTC

Hi Paul,

Post by Paul Eggert
Perhaps this is because I am a fan of shorter, more-intuitive numbers. You can
blame me for the fact that in Emacs the double-precision floating-point number
closest to 0.1 displays as "0.1" rather than as the more-precise but uglier
"0.10000000000000001".

Likewise, in Lisp culture, this kind of shorter external representation of
floating-point is used:
- In Common Lisp, it is specified by the standard [1]. The goal of using as few
digits as possible is implicit.
- Likewise, in Scheme, it is specified by the standard [2].
- Some schemers even thought it was worthwhile to write a paper about their
implementation of this specification [3].

Bruno

[1] https://www.cs.cmu.edu/Groups/AI/util/html/cltl/clm/node187.html
"reading a printed representation produces an object that is ... equal to
the originally printed object."
[2] http://www.schemers.org/Documents/Standards/R5RS/HTML/r5rs-Z-H-9.html#%_sec_6.2.6
"... is expressed using the minimum number of digits ..."
[3] http://www.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96-abstract.html

Paul Eggert

2017-04-23 01:38:45 UTC

Post by Bruno Haible
- Some schemers even thought it was worthwhile to write a paper about their
implementation of this specification. [3].

Yes, Gnulib addresses this problem in the ftoastr module, using a simpler but
presumably less-efficient approach. As it happens, an improved algorithm was
published by the Lerner group in POPL'16, so I installed the attached.

Bruno Haible

2017-04-21 23:02:44 UTC

Hi Gisle,

Post by Gisle Vanem
When using MSVC-2015 to build the tests/unistdio/test-ulc-*.c files,
I get ASSERT() on all the '%a' formats. E.g. in unistdio/test-ulc-vasnprintf1.exe
char *result =
my_xasprintf ("%a %d", 3.1416015625, 33, 44, 55);
ASSERT (result != NULL);
ASSERT (strcmp (result, "0x1.922p+1 33") == 0
|| strcmp (result, "0x3.244p+0 33") == 0
|| strcmp (result, "0x6.488p-1 33") == 0
|| strcmp (result, "0xc.91p-2 33") == 0);
The 'result' I get is '0x1.9220000000000p+1 33'.

Can you provide a patch? To do so, create a testdir

./gnulib-tool --create-testdir --dir=../testdir-ulc1 --single-configure \
unistdio/ulc-fprintf unistdio/ulc-asprintf unistdio/ulc-snprintf \
unistdio/ulc-vasnprintf unistdio/ulc-vfprintf

Build it, run "make check", and add '|| strcmp ...' alternatives until the
tests pass.

Can you do this please? I can't, since for me (with MSVC 14 = Visual Studio 2015
on Windows 10), the tests pass.

Bruno