Edward Catmur
2017-04-10 09:52:41 UTC
First of all, I want to say that I think to_chars and from_chars are a great
addition to the Standard and I look forward to using them in C++17. I have
a few questions regarding their behavior on floating-point types.
(As background for the first few questions: for each floating-point type
there are a (relatively) small number of large integers that are exactly
halfway between two adjacent values of that type, and which have a
relatively short scientific decimal representation. For example, 1e23 has
hexadecimal floating-point representation 0x1.52d02c7e14af68p76, which is
exactly halfway between the adjacent IEEE 754 (64-bit) double values
0x1.52d02c7e14af6p76 and 0x1.52d02c7e14af7p76. Parsing the string "1e23"
into double using from_chars [utility.from.chars] is required to produce
one of those two values.)
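
The halfway claim can be checked with integer arithmetic alone. The following is a minimal sketch, assuming IEEE 754 binary64 double and C++17 hexadecimal floating literals; the helper names (pow5_23, is_exact_halfway, lower_neighbour, upper_neighbour, check_halfway_claim) are made up for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// 10^23 = 5^23 * 2^23, so only 5^23 decides whether 10^23 fits a double.
constexpr std::uint64_t pow5_23() {
    std::uint64_t p = 1;
    for (int i = 0; i < 23; ++i) p *= 5;
    return p;  // 11920928955078125
}

// True if 5^23 is an odd 54-bit integer: one bit too wide for the 53-bit
// significand of an IEEE 754 double, with the extra low bit set.
constexpr bool is_exact_halfway() {
    return (pow5_23() >> 53) == 1 && (pow5_23() & 1) == 1;
}

// 5^23 - 1 and 5^23 + 1 are even 54-bit integers, so they convert to double
// exactly, and scaling by 2^23 with ldexp does not round.
inline double lower_neighbour() {
    return std::ldexp(static_cast<double>(pow5_23() - 1), 23);
}
inline double upper_neighbour() {
    return std::ldexp(static_cast<double>(pow5_23() + 1), 23);
}

// Checks the claims from the background paragraph (IEEE 754 double assumed).
inline void check_halfway_claim() {
    assert(is_exact_halfway());
    assert(lower_neighbour() == 0x1.52d02c7e14af6p76);
    assert(upper_neighbour() == 0x1.52d02c7e14af7p76);
    assert(std::nextafter(lower_neighbour(), upper_neighbour()) == upper_neighbour());
}
```

Calling check_halfway_claim() from main should pass silently on such a platform.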
Firstly, is from_chars expected to have deterministic behavior (the same
input always yielding the same value), or is it allowed to depend on e.g.
the floating-point environment or the use of 80-bit floating point (32-bit
Linux on x86)?
Secondly, is from_chars expected or encouraged to have the same behavior as
the compiler? That is, given double d; auto s = "1e23"; should we ever
expect 1e23 == (from_chars(s, s + 4, d), d) to fail?
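
For concreteness, the comparison in question might be written out as below. This is only a sketch: parse_double, runtime_matches_literal and check_parse_is_a_neighbour are hypothetical helpers, and it assumes a standard library that ships the floating-point overloads of std::from_chars. The check function asserts only the weaker property that the parse yields one of the two neighbouring values; whether runtime_matches_literal() must return true is exactly the open question.

```cpp
#include <cassert>
#include <charconv>
#include <cstring>
#include <system_error>

// Parses a decimal string with std::from_chars; returns false unless the
// whole string was consumed without error.
inline bool parse_double(const char* s, double& out) {
    const char* end = s + std::strlen(s);
    const std::from_chars_result r = std::from_chars(s, end, out);
    return r.ec == std::errc{} && r.ptr == end;
}

// Compares the run-time parse of "1e23" with the compiler's translation of
// the literal 1e23. Nothing here assumes the two must agree.
inline bool runtime_matches_literal() {
    double d = 0.0;
    return parse_double("1e23", d) && d == 1e23;
}

// The weaker property: the parse lands on one of the two adjacent doubles.
inline void check_parse_is_a_neighbour() {
    double d = 0.0;
    assert(parse_double("1e23", d));
    assert(d == 0x1.52d02c7e14af6p76 || d == 0x1.52d02c7e14af7p76);
}
```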
Most importantly, is to_chars permitted to produce an overlong output where
the shorter output round-trips on the same implementation but is not
guaranteed to do so globally? For example, if an implementation always
reads "1e23" as 0x1.52d02c7e14af6p76, is it permitted to output
0x1.52d02c7e14af6p76 as "9.999999999999999e22" on the basis that this is
guaranteed to be read correctly by a different implementation that might
read "1e23" as 0x1.52d02c7e14af7p76?
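
A small sketch of the same-implementation round trip may help frame this. It deliberately asserts nothing about which string to_chars produces, since that is the question; it only checks that the output reads back to the same value through from_chars on the same implementation. The names shortest and round_trips are illustrative, and the 64-byte buffer is an assumption chosen to be generous for double.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Formats with the plain std::to_chars overload (no format, no precision).
// Returns an empty string if the conversion reports an error.
inline std::string shortest(double value) {
    char buf[64];  // assumed to be generous for any double
    const std::to_chars_result r = std::to_chars(buf, buf + sizeof buf, value);
    return r.ec == std::errc{} ? std::string(buf, r.ptr) : std::string{};
}

// True if the to_chars output parses back to exactly the same value.
inline bool round_trips(double value) {
    const std::string s = shortest(value);
    double back = 0.0;
    const std::from_chars_result r =
        std::from_chars(s.data(), s.data() + s.size(), back);
    return !s.empty() && r.ec == std::errc{} && back == value;
}

// Both neighbours of 10^23 should survive the round trip.
inline void check_neighbours_round_trip() {
    assert(round_trips(0x1.52d02c7e14af6p76));
    assert(round_trips(0x1.52d02c7e14af7p76));
}
```

Printing shortest(0x1.52d02c7e14af6p76) on a given implementation shows which of the two candidate strings it picks.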
In addition, I would be interested in knowing whether the following
underspecification is intentional:
Is the result of to_chars() required to be the closest to the input value
among the round-tripping strings of that length? For example,
0x1.0000000000001p0 is approx. 1.000000000000000222045, so is
1.0000000000000003 an acceptable output from to_chars, or only
1.0000000000000002? Or consider the smallest positive subnormal IEEE
double, 0x1p-1074, approx. 4.94e-324 - is 4e-324 an acceptable output, or
only 5e-324? (In Florian Loitsch's paper [1], this is the "closeness"
property of Grisu3.)
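
The premise that both candidates in each pair round-trip can be sketched as follows. std::strtod is used here purely as a stand-in reader, on the assumption that it rounds correctly (which the C standard does not strictly guarantee); read_double, one_plus_ulp and candidates_round_trip are names invented for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <limits>

// Stand-in reader; a correctly rounding strtod is assumed.
inline double read_double(const char* s) { return std::strtod(s, nullptr); }

// 1 + 2^-52, written in the text as 0x1.0000000000001p0.
inline double one_plus_ulp() {
    return 1.0 + std::numeric_limits<double>::epsilon();
}

// Each pair of candidate outputs should read back as the same double, so
// round-tripping alone does not distinguish the closer string from the other.
inline bool candidates_round_trip() {
    const double tiny = std::numeric_limits<double>::denorm_min();  // 0x1p-1074
    return read_double("1.0000000000000002") == one_plus_ulp()
        && read_double("1.0000000000000003") == one_plus_ulp()
        && read_double("5e-324") == tiny
        && read_double("4e-324") == tiny;
}

inline void check_candidates() { assert(candidates_round_trip()); }
```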
I hope the above questions don't come across as overly pedantic; I would be
perfectly satisfied to be told that all of the above are QOI matters, but
I'd hope to know what to expect before retiring our current code using
Google double-conversion[2].
Finally, it would be useful to know the minimum buffer size necessary to
guarantee successful conversion in all cases. I would guess this is
something like 4 + numeric_limits<T>::max_digits10 +
max(log10(numeric_limits<T>::max_exponent10), 1 +
log10(-numeric_limits<T>::min_exponent10)) but it would be useful to have
confirmation of this calculation or indeed to have it available in the
Standard as a constant.
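
As a rough sketch of that calculation, the bound can be computed at compile time by counting exponent digits in integer arithmetic instead of calling log10. Everything here is an assumption for illustration, not anything the Standard provides: the helper names, the printf-style minimum of two exponent digits, and the widening of the exponent range by max_digits10 decades to cover subnormals conservatively. It bounds only the scientific form "-d.ddde-xxx"; output explicitly requested in fixed notation can be far longer.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Number of decimal digits needed to write a non-negative integer.
constexpr std::size_t decimal_digits(int n) {
    std::size_t digits = 1;
    while (n >= 10) { n /= 10; ++digits; }
    return digits;
}

// Hypothetical helper: upper bound on the length of "-d.ddd...de-xxx" with
// max_digits10 significant digits.
template <class T>
constexpr std::size_t scientific_bound() {
    using L = std::numeric_limits<T>;
    // Subnormals reach below min_exponent10; max_digits10 extra decades is a
    // conservative allowance (an assumption of this sketch).
    constexpr int lowest = -L::min_exponent10 + L::max_digits10;
    constexpr int widest = L::max_exponent10 > lowest ? L::max_exponent10 : lowest;
    // Assume at least two exponent digits, as in "1e+23".
    constexpr std::size_t exp_digits =
        decimal_digits(widest) < 2 ? 2 : decimal_digits(widest);
    return 1                                           // leading '-'
         + static_cast<std::size_t>(L::max_digits10)   // significant digits
         + 1                                           // decimal point
         + 2                                           // 'e' and exponent sign
         + exp_digits;
}

// With IEEE 754 formats this gives 24 for double and 15 for float.
inline void check_bounds() {
    assert(scientific_bound<double>() == 24);
    assert(scientific_bound<float>() == 15);
}
```

For IEEE 754 double the worst case has the shape of "-2.2250738585072014e-308", which is 24 characters.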
Thanks!
1. http://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf
2. https://github.com/google/double-conversion
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.