[std-discussion] Type punning with memcpy


Language Lawyer

2018-05-25 03:00:12 UTC

Where exactly is it defined?

Is http://eel.is/c++draft/cstring.syn#1, «The contents and meaning of
the header <cstring> are the same as the C standard library header
<string.h>», plus the definition from the C standard, «The memcpy function
copies n characters from the object pointed to by s2 into the object
pointed to by s1», enough to make it defined?

--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.

Jens Maurer

2018-05-25 08:19:30 UTC

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?

Could you give some example code, please?

Jens


Language Lawyer

2018-05-25 11:08:12 UTC

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

Example code:
float f;
int i = 42;
memcpy(&f, &i, sizeof(f));


Thiago Macieira

2018-05-25 12:58:04 UTC

Post by Language Lawyer

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


'Edward Catmur' via ISO C++ Standard - Discussion

2018-05-25 15:37:21 UTC

Post by Thiago Macieira

Post by Language Lawyer

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.

It would appear to be type-punning according to Wikipedia's definition, and
from the wording "the appropriate part of the object representation of the
value is reinterpreted as an object representation in the new type"

I think it's reasonably clear that it's distinguishing the effect (type
punning) from the mechanism (there, unions; here, memcpy; elsewhere,
aliasing).


Thiago Macieira

2018-05-25 15:48:29 UTC

On Friday, 25 May 2018 12:37:21 -03 'Edward Catmur' via ISO C++ Standard -

Post by 'Edward Catmur' via ISO C++ Standard - Discussion

Post by Thiago Macieira

Post by Language Lawyer
float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.

It would appear to be type-punning according to Wikipedia's definition, and

Post by Thiago Macieira
the appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type

I think it's reasonably clear that it's distinguishing the effect (type
punning) from the mechanism (there, unions; here, memcpy; elsewhere,
aliasing).

There's no problem doing this, so long as you don't try to use the f variable,
as it contains a value not defined by the standard. C says it's reinterpreted
in the new type; C++ says you can't read it at all.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Hyman Rosen

2018-05-25 15:59:30 UTC

Post by Thiago Macieira
There's no problem doing this, so long as you don't try to use the f variable,
as it contains a value not defined by the standard. C says it's reinterpreted
in the new type; C++ says you can't read it at all.

The C and C++ object models are flawed and useful only to optimizationists.
The correct model is that a properly-aligned and accessible area of memory
that contains a valid object representation of an object contains that
object.


Thiago Macieira

2018-05-25 19:39:42 UTC

Post by Hyman Rosen
The C and C++ object models are flawed and useful only to optimizationists.
The correct model is that a properly-aligned and accessible area of memory
that contains a valid object representation of an object contains that
object.

I understand where you're coming from and your argument regarding the C++
memory model. But you're going too far on the C one, because it's exactly what
you usually want.

There is no third option. Either you can read the value or you can't. C says
you can, which is what you usually want. It says that the value contained
there is reinterpreted according to the target type's rules, which is what you
want.

It just doesn't say what the behaviour will be, because it can't: you may have
loaded an invalid value and trying to use it could produce a processor
exception due to operating on a NaN, denormal, or a trap representation. It's
up to you, the programmer, to ensure that the value you copied is one the
target type can consume.
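
To make the point about the programmer's responsibility concrete, here is a minimal sketch that copies a 32-bit pattern into a float and classifies the result before anything else is done with it. It assumes float is a 32-bit IEC 559 type with the same byte order as uint32_t (the first two assumptions are checked by static_assert), and it assumes that reading the float after the memcpy is permitted, which is the very point under discussion in this thread. The helper name classify_bits is hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes an IEC 559 / IEEE 754 float");

// Hypothetical helper: copy the bytes of `bits` into a float, then report
// its floating-point category (FP_NORMAL, FP_SUBNORMAL, FP_NAN, ...).
int classify_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof(f));
    return std::fpclassify(f);
}
```

Under these assumptions the pattern of the int 42 from the original example lands in the subnormal range, so a caller that cannot tolerate denormals or NaNs can reject the value before using it.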
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


'Edward Catmur' via ISO C++ Standard - Discussion

2018-05-25 17:01:04 UTC

Post by Thiago Macieira
On Friday, 25 May 2018 12:37:21 -03 'Edward Catmur' via ISO C++ Standard -

Post by 'Edward Catmur' via ISO C++ Standard - Discussion

Post by Thiago Macieira

Post by Language Lawyer
float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.

It would appear to be type-punning according to Wikipedia's definition,

and

Post by 'Edward Catmur' via ISO C++ Standard - Discussion

Post by Thiago Macieira
the appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type

I think it's reasonably clear that it's distinguishing the effect (type
punning) from the mechanism (there, unions; here, memcpy; elsewhere,
aliasing).

There's no problem doing this, so long as you don't try to use the f variable,
as it contains a value not defined by the standard. C says it's reinterpreted
in the new type; C++ says you can't read it at all.

Agreed. OTOH C++20 may have bit_cast https://wg21.link/P0476 which gives us
access to the C behavior.
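
For readers who have not seen the proposal, the following is a rough sketch of the kind of interface P0476 describes, written with memcpy so that it compiles before C++20. The name bit_cast_sketch is hypothetical, and this is not the standard library's implementation; it only illustrates the constraints (equal sizes, trivially copyable types) that such a facility would check.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the facility proposed in P0476.
// Copies the object representation of `from` into a new object of type To.
template <typename To, typename From>
To bit_cast_sketch(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value,
                  "To must be trivially copyable");
    static_assert(std::is_trivially_copyable<From>::value,
                  "From must be trivially copyable");
    To to;  // filled entirely by the memcpy below
    std::memcpy(&to, &from, sizeof(To));
    return to;
}
```

A round trip such as bit_cast_sketch<float>(bit_cast_sketch<std::uint32_t>(3.5f)) is expected to give back 3.5f on implementations where float and std::uint32_t have the same size.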


Language Lawyer

2018-06-13 21:04:58 UTC

C says it's reinterpreted in the new type;

BTW, where?


Thiago Macieira

2018-06-14 04:31:49 UTC

Post by Language Lawyer

C says it's reinterpreted in the new type;

BTW, where?

C11 6.5.2.3 Structure and union members, note 95 (attached to paragraph 3):

"95) If the member used to read the contents of a union object is not the same
as the member last used to store a value in the object, the appropriate part
of the object representation of the value is reinterpreted as an object
representation in the new type as described in 6.2.6 (a process sometimes
called ‘‘type punning’’). This might be a trap representation."
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Language Lawyer

2018-06-13 22:17:11 UTC

Post by 'Edward Catmur' via ISO C++ Standard - Discussion
It would appear to be type-punning according to Wikipedia's definition, and

Post by Thiago Macieira
the appropriate part of the object representation of the value is

reinterpreted as an object representation in the new type

BTW, this is from a note about reading from inactive member of a union.

Post by 'Edward Catmur' via ISO C++ Standard - Discussion
If the member used to read the contents of a union object is not the same as the member last used to
store a value in the object, the appropriate part of the object representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6
When a value is stored in a member of an object of union type, the bytes of the object
representation that do not correspond to that member but do correspond to other members
take unspecified values.

So, here are 2 options:
1. The note lies and reading from an inactive member of a union is UB, because 6.2.6 does not *explicitly* allow it.
2. 6.2.6 does not have to *explicitly* allow this, because we can deduce such behaviour from the wording about object representation (sequence of chars etc.)

If one chooses the second option, then one should admit that memcpy-ing in C++ leads to reinterpretation of the bytes copied into an object without explicit wording allowing this, because the wording about object representation is more or less the same in C and C++, at least for objects of trivially copyable or standard-layout type.


'Edward Catmur' via ISO C++ Standard - Discussion

2018-06-14 18:05:26 UTC

Post by Language Lawyer

Post by 'Edward Catmur' via ISO C++ Standard - Discussion
It would appear to be type-punning according to Wikipedia's definition, and
the appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type

BTW, this is from a note about reading from inactive member of a union.
Because notes are not normative by itself, the note refers to "6.2.6

Post by 'Edward Catmur' via ISO C++ Standard - Discussion
If the member used to read the contents of a union object is not the same
as the member last used to
store a value in the object, the appropriate part of the object
representation of the value is reinterpreted
as an object representation in the new type as described in 6.2.6
When a value is stored in a member of an object of union type, the bytes of the object
representation that do not correspond to that member but do correspond to other members
take unspecified values.

1. The note lie and reading from inactive member of a union is UB, because
6.2.6 does not *explicitly* allow it.
2. 6.2.6 does not have to *explicitly* allow this, because we can deduce
such a behaviour from wording about objects representation (sequence of
chars etc.)

I would interpret the note as saying that reading from the inactive member
of a union in C behaves as if that member were active but having the object
representation of the relevant bytes of the union (that is, of its actual
active member with any remaining bytes taking unspecified values), 6.2.6
then describing the cases in which this has a defined result (integral
types without padding bits).

It would seem the linguistic issue is whether "as described in 6.2.6"
qualifies "reinterpreted as an object representation in the new type" or
merely "object representation". Since 6.2.6 does not cover reinterpretation
of object representations, but it does cover object representations, the
preferred reading must be the latter.

This would imply that contrary to the general rule the note is normative;
that without it reading from the inactive member of a union would be UB in
C.

Post by Language Lawyer
If one chooses the second option, then one should admit that memcpy-ing in
C++ leads to reinterpretation of bytes copied into an object without
explicit wording allowing this, because the wording about object
representation is more-or-less the same in C and C++, at least for objects
of trivially copyable or standard-layout type.

I don't think that reasoning holds. The fact that object representations
are equivalent means that reading object representations of equivalent
values will behave similarly. It doesn't imply that altering object
representations by writing bytes is permitted outside the explicitly
enumerated cases. Indeed, if this were the case then [basic.types]/2 and /3
would be redundant. Conversely, although it's difficult to find wording in C
explicitly permitting altering object representations by writing bytes,
this is implied by 6.2.6.1p5.
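
As an illustration of the two explicitly enumerated cases, here is a small sketch of what [basic.types]/2 and /3 cover: copying a trivially copyable object out to a byte buffer and back, and copying between two objects of the same type. The Point type and both function names are hypothetical and chosen only for this example.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical trivially copyable type used only for this sketch.
struct Point { int x; int y; };
static_assert(std::is_trivially_copyable<Point>::value,
              "Point must be trivially copyable");

// [basic.types]/2 case: bytes go out to an unsigned char buffer and come
// back into the same object, which then holds its original value again.
Point round_trip_through_buffer(Point p) {
    unsigned char buf[sizeof(Point)];
    std::memcpy(buf, &p, sizeof(Point));
    p.x = 0;  // overwrite the object so the restore is observable
    p.y = 0;
    std::memcpy(&p, buf, sizeof(Point));
    return p;
}

// [basic.types]/3 case: bytes are copied between two distinct objects of
// the same trivially copyable type.
Point copy_between_objects(const Point& src) {
    Point dst{};
    std::memcpy(&dst, &src, sizeof(Point));
    return dst;
}
```

Neither function changes the type through which the bytes are interpreted, which is what separates these cases from the int-to-float copy at the start of the thread.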


Language Lawyer

2018-05-25 18:06:06 UTC

Post by Thiago Macieira

Post by Language Lawyer

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.

Really? Probably I was misled by some «common "knowledge"» that «in C you
can do type punning through union or with memcpy, but in C++ memcpy is the
only option».
A lot of such «knowledge» is hanging around since «C is a portable
assembler» times. :-\


Hyman Rosen

2018-05-25 18:18:56 UTC

Post by Language Lawyer
A lot of such «knowledge» is hanging around since «C is a portable
assembler» times. :-\

If that knowledge isn't correct, it should be made to be correct. Having
the following three items be true would make programming in C++ (and C)
enormously less error-prone:

1) A properly aligned and accessible block of memory that contains a valid
object representation of an object contains the object.

2) Expressions are evaluated in strict left-to-right order, including side
effects.

3) All integer arithmetic has two's-complement semantics, and shifts and
overflows always produce defined behavior.

Optional but useful fourth item:

4) Library routines that expect null-terminated arrays as input treat null
pointers as empty arrays.


Andrey Semashev

2018-05-25 19:29:02 UTC

On Fri, May 25, 2018 at 2:06 PM Language Lawyer
A lot of such «knowledge» is hanging around since «C is a portable
assembler» times. :-\
If that knowledge isn't correct, it should be made to be correct.

I disagree, as do, I'm sure, many others on this list. You can call me and
other people who disagree optimizationists, but I would say that being
able to write high performance code with as little overhead as possible
has been one of the key advantages of C and C++ for decades and, gladly,
it is not going away. If that doesn't fit your needs or preferences, you
should probably look at another language, and there's nothing wrong
with that. But please, avoid derailing topics into this kind of
discussion about how optimizationists rule C++.


Hyman Rosen

2018-05-25 21:10:52 UTC

Post by Andrey Semashev
But please, avoid derailing topics into this kind of
discussions about how optimizationists rule C++.

We are now 20 years on from the first C++ standard, and experts in C++
(I would say most of the people who post here qualify) aren't certain what
the standard says about
    float f; int i = 42; memcpy(&f, &i, sizeof(f));

I disagree that it's derailing to argue that something is very wrong with
the definition of the language.

Back to the topic, how about the following program?

#include <iostream>
#include <cstring>
#include <cmath>
#include <cfloat>
using namespace std;
static_assert((sizeof(int)) == (sizeof(float)), "int/float size differ");
const auto N = sizeof(float);
int main() {
    int i = 42;     char bufi[N]; memcpy(bufi, &i, N);
    float f = 0.0f; char buff[N]; memcpy(buff, &f, N);
    while (0 != memcmp(bufi, buff, N)) {
        f = nextafter(f, FLT_MAX);
        memcpy(buff, &f, N);
    }
    memcpy(&f, bufi, N); cout << f << "\n"; // #1
    memcpy(&f, buff, N); cout << f << "\n"; // #2
}

Both #1 and #2 copy the same bytes into f, but the bytes in #1 came out of
a separate int and the bytes at #2 came out of f itself. Is one of these
cases undefined behavior and the other not? If so, is it derailing to ask
whether it is a good thing to have a language where the legality of the
result of copying bytes into an object is path-dependent on the origin of
the bytes rather than their value?


'Edward Catmur' via ISO C++ Standard - Discussion

2018-05-25 21:27:41 UTC

Post by Hyman Rosen

Post by Andrey Semashev
But please, avoid derailing topics into this kind of
discussions about how optimizationists rule C++.

We are now 20 years on from the first C++ standard, and experts in C++
(I would say most of the people who post here qualify) aren't certain what
the standard says about
float f; int i = 42; memcpy(&f, &i, sizeof(f));
I disagree that it's derailing to argue that something is very wrong with
the
definition of the language.
Back to the topic, how about the following program?
#include <iostream>
#include <cstring>
#include <cmath>
#include <cfloat>
using namespace std;
static_assert((sizeof(int)) == (sizeof(float)), "int/float size differ");
const auto N = sizeof(float);
int main() {
int i = 42; char bufi[N]; memcpy(bufi, &i, N);
float f = 0.0f; char buff[N]; memcpy(buff, &f, N);
while (0 != memcmp(bufi, buff, N)) {
f = nextafter(f, FLT_MAX);
memcpy(buff, &f, N);
}
memcpy(&f, bufi, N); cout << f << "\n"; // #1
memcpy(&f, buff, N); cout << f << "\n"; // #2
}
Both #1 and #2 copy the same bytes into f, but the bytes in #1 came out of
a separate int and the bytes at #2 came out of f itself. Is one of these
cases
undefined behavior and the other not? If so, is it derailing to ask
whether it is
a good thing to have a language where the legality of the result of
copying bytes
into an object is path-dependent on the origin of the bytes rather than
their value?

It's a fair question, and one I've asked myself before. However note that
we have a similar situation with array pointers:

int a[2][2];
auto p = &a[0][2], q = &a[1][0];
assert(p == q);

But q may be indirected while p may not. So does the origin of a value
inform how it may be used.


Hyman Rosen

2018-05-25 21:48:17 UTC

I disagree that this should be the case, of course.


Thiago Macieira

2018-05-26 02:22:33 UTC

Post by Hyman Rosen
We are now 20 years on from the first C++ standard, and experts in C++
(I would say most of the people who post here qualify) aren't certain what
the standard says about
float f; int i = 42; memcpy(&f, &i, sizeof(f));

I'd say everyone is quite clear that the value of f after this is not
specified by the standard.

But we can get clear of floating point, and of all I/O. Let's stick to
a very, very simple case:

bool f()
{
    uint32_t ui = 0;
    uint16_t us = 42;
    memcpy(&ui, &us, sizeof(us));
    return ui == 42;
}

Does this function return true?

There's no floating point with its unknown representation format. There is no
signed integer with trap representations. Everything here is unsigned
integers. And yet we can't have an answer: we don't want to.

The ui variable has a value. The C standard says that the bits are copied and
then reinterpreted, but doesn't say what the value is. The C++ standard says
nothing about the operation at all. I think it should adopt the wording from
C, though.
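
As a hedged illustration of why the answer depends on the implementation, the sketch below performs the same copy and compares the result against what each of the two common byte orders would produce. Both helper names are hypothetical, and mixed-endian layouts are deliberately ignored; the sketch also assumes, as the function above does, that reading ui after the memcpy is permitted.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy a 16-bit value over the first two bytes of a
// zero-initialised 32-bit value, as in the function discussed above.
std::uint32_t widen_by_memcpy(std::uint16_t us) {
    std::uint32_t ui = 0;
    std::memcpy(&ui, &us, sizeof(us));
    return ui;
}

// Hypothetical helper: true when the lowest-addressed byte of a uint32_t
// holds its low-order bits (little-endian layout).
bool first_byte_is_low_order() {
    const std::uint32_t one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);
    return first == 1;
}
```

On a little-endian implementation widen_by_memcpy(42) is expected to give 42, so the function would return true; on a big-endian one the two copied bytes become the high-order half and the expected value is 42 << 16, so it would return false.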
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Hyman Rosen

2018-05-27 19:39:53 UTC

Post by Thiago Macieira

Post by Hyman Rosen
We are now 20 years on from the first C++ standard, and experts in C++
(I would say most of the people who post here qualify) aren't certain

what

Post by Hyman Rosen
the standard says about
float f; int i = 42; memcpy(&f, &i, sizeof(f));

I'd say everyone is quite clear that the value of f after this is not
specified by the standard.

I don't want the standard to specify the value of f. I want the standard to
say that if the result of the memcpy is such that f contains the object
representation of a valid floating-point value (and for IEEE, it must,
since there are no excluded bit patterns), then f contains that value, and
may be used in all ways as such. In particular, the compiler is not
permitted to treat f as if it were uninitialized.

Post by Thiago Macieira
But we can get clear of floating point, and of all I/O. Let's stick to
a very, very simple case:
bool f()
{
    uint32_t ui = 0;
    uint16_t us = 42;
    memcpy(&ui, &us, sizeof(us));
    return ui == 42;
}
Does this function return true?

This is the same case. I do not want the standard to tell us whether this
returns true. I want the standard to say that the memcpy has modified the
object representation of ui, and that if ui contains a valid object
representation of a uint32_t, then ui has that value. (Results will differ
based on the endianness of the implementation, for example.)

Post by Thiago Macieira
And yet we can't have an answer: we don't want to.

Of course we can. The compiler knows the endianness of the system, and if
it's smart enough to track the copied value, it will know that ui now
contains 0 or 42, and it can compile away the copy. If it doesn't track,
things just get evaluated at runtime.

Post by Thiago Macieira
The ui variable has a value. The C standard says that the bits are copied
and then reinterpreted, but doesn't say what the value is. The C++ standard
says nothing about the operation at all. I think it should adopt the wording
from C, though.

Yes.


Thiago Macieira

2018-05-27 20:49:00 UTC

Post by Hyman Rosen

Post by Thiago Macieira

Post by Hyman Rosen
We are now 20 years on from the first C++ standard, and experts in C++
(I would say most of the people who post here qualify) aren't certain

what

Post by Hyman Rosen
the standard says about
float f; int i = 42; memcpy(&f, &i, sizeof(f));

I'd say everyone is quite clear that the value of f after this is not
specified by the standard.

I don't want the standard to specify the value of f. I want the standard to
say that
if the result of the memcpy is such that f contains the object
representation of a
valid floating-point value (and for IEEE, it must, since there are no
excluded bit
patterns), then f contains that value, and may be used in all ways as
such. In
particular, the compiler is not permitted to treat f as if it were
uninitialized.

I agree the compiler cannot treat it as uninitialised.

The standard does not require the floating point types to be of any particular
format. But it does provide a way to determine if they are IEC 559 / IEEE 754.
If the implementation does have std::numeric_limits<float>::is_iec559 == true,
then your valid patterns for IEEE 754 should be accepted too.
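
A minimal sketch of how a program could gate the copy on that query follows. The helper float_from_ieee_bits is hypothetical; it assumes a 32-bit float whose byte order matches that of uint32_t, and it only attempts the copy when the implementation reports IEC 559 conformance.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: interpret `bits` as an IEEE 754 binary32 pattern.
// Returns false, leaving `out` untouched, when the implementation does not
// claim IEC 559 conformance for float.
bool float_from_ieee_bits(std::uint32_t bits, float& out) {
    if (!std::numeric_limits<float>::is_iec559) {
        return false;  // meaning of the bit pattern is not pinned down
    }
    std::memcpy(&out, &bits, sizeof(out));
    return true;
}
```

With the gate in place, a pattern such as 0x3F800000 is expected to yield 1.0f whenever the function returns true.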

Post by Hyman Rosen

Post by Thiago Macieira
bool f()
{
    uint32_t ui = 0;
    uint16_t us = 42;
    memcpy(&ui, &us, sizeof(us));
    return ui == 42;
}
Does this function return true?

This is the same case. I do not want the standard to tell us whether this
returns true. I want the standard to say that the memcpy has modified the
object representation of ui, and that if ui contains a valid object
representation of a uint32_t, then ui has that value. (Results will differ
based on the endianness of the implementation, for example.)

I agree.

Post by Hyman Rosen

Post by Thiago Macieira
The ui variable has a value. The C standard says that the bits are copied and
then reinterpreted, but doesn't say what the value is. The C++ standard says
nothing about the operation at all. I think it should adopt the wording from
C, though.

Yes.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Florian Weimer

2018-05-26 12:21:25 UTC

Post by Hyman Rosen
If that knowledge isn't correct, it should be made to be correct. Having
the following three
items be true would make programming in C++ (and C) enormously less
1) A properly aligned and accessible block of memory that contains a
valid object representation of an object contains the object.

I don't think this is really possible to achieve without library
implementation changes. Clear semantics also seem to require garbage
collection.

The reason for my concerns is that the semantics of objects can depend
on their address. This can either be due to internal pointers within
the object itself, or some external dependency.

The C11 mutexes are a good example for the latter. The standard is
ambiguously worded and seems to require the behavior you describe, but
everyone agrees that an implementation which makes the address of the
mutex a significant part of its behavior is a valid implementation.

It is of course possible to get the copying behavior if all
non-copyable objects have incomplete (or private) types, are hidden
behind pointers, and are allocated somewhere else. The pointers can
then be put into the public data structures defined by the library.
It leads to clear semantics regarding object copying (especially with
garbage collection, hence my initial remark), but it's not really C
(or C++) anymore.

Post by Hyman Rosen
2) Expressions are evaluated in strict left-to-right order, including side
effects.
3) All integer arithmetic has two's-complement semantics, and shifts
and overflows always produce defined behavior.

Sure, that would be nice, especially if we could agree on the exact
behavior of such shifts (because existing silicon disagrees).

Post by Hyman Rosen
4) Library routines that expect null-terminated arrays as input treat null
pointers as empty arrays.

That's not compatible with existing practice, where NULL is
interpreted differently from an empty array. Even the much more
restrictive viewpoint, that functions processing arrays should accept
null pointers if the array length is given as zero, is controversial.
(Considering that std::vector::data() can return nullptr for an empty
vector, I personally think we need to allow nullptr for zero-length
memset/memcpy/memcmp, but that seems a minority view.)
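
Until that is settled, a caller can sidestep the question with a guard. The following is only a sketch; copy_bytes and append are hypothetical names. The wrapper never hands memcpy a pointer when the length is zero, so a null data() from an empty vector is harmless.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical wrapper: skip the call entirely when there is nothing to copy,
// so a null pointer from an empty container never reaches memcpy.
inline void* copy_bytes(void* dst, const void* src, std::size_t n)
{
    if (n == 0)
        return dst;
    return std::memcpy(dst, src, n);
}

// Hypothetical usage: append the elements of `from` to `to`.
inline void append(std::vector<int>& to, const std::vector<int>& from)
{
    const std::size_t old_size = to.size();
    to.resize(old_size + from.size());
    // When `from` is empty, neither pointer is passed on to memcpy.
    copy_bytes(to.data() + old_size, from.data(), from.size() * sizeof(int));
}
```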

Richard Hodges

2018-05-26 20:42:19 UTC

Post by Florian Weimer

Post by Hyman Rosen
If that knowledge isn't correct, it should be made to be correct. Having
the following three
items be true would make programming in C++ (and C) enormously less
1) A properly aligned and accessible block of memory that contains a
valid object representation of an object contains the object.

I don't think this is really possible to achieve without library
implementation changes. Clear semantics also seem to require garbage
collection.
The reason for my concerns is that the semantics of objects can depend
on their address. This can either be due to internal pointers within
the object itself, or some external dependency.
The C11 mutexes are a good example for the latter. The standard is
ambiguously worded and seems to require the behavior you describe, but
everyone agrees that an implementation which makes the address of the
mutex a significant part of its behavior is a valid implementation.
It is of course possible to get the copying behavior if all
non-copyable objects have incomplete (or private) types, are hidden
behind pointers, and are allocated somewhere else. The pointers can
then be put into the public data structures defined by the library.
It leads to clear semantics regarding object copying (especially with
garbage collection, hence my initial remark), but it's not really C
(or C++) anymore.

Post by Hyman Rosen
2) Expressions are evaluated in strict left-to-right order, including side
effects.
3) All integer arithmetic has two's-complement semantics, and shifts
and overflows always produce defined behavior.

Sure, that would be nice, especially if we could agree on the exact
behavior of such shifts (because existing silicon disagrees).

Post by Hyman Rosen
4) Library routines that expect null-terminated arrays as input treat null
pointers as empty arrays.

That's not compatible with existing practice, where NULL is
interpreted differently from an empty array. Even the much more
restrictive viewpoint, that functions processing arrays should accept
null pointers if the array length is given as zero, is controversial.
(Considering that std::vector::data() can return nullptr for an empty
vector, I personally think we need to allow nullptr for zero-length
memset/memcpy/memcmp, but that seems a minority view.)

I find it odd that this is a minority view. No part of a zero-length array
can be accessed, so the values of the pointers that mark the start of such
an array should be irrelevant to any sane implementation of any
array-manipulating library function.

Thiago Macieira

2018-05-26 20:49:44 UTC

Post by Richard Hodges

Post by Florian Weimer
(Considering that std::vector::data() can return nullptr for an empty
vector, I personally think we need to allow nullptr for zero-length
memset/memcpy/memcmp, but that seems a minority view.)

I find it odd that this is a minority view. No part of a zero-length array
can be accessed, so the values of the pointers that mark the start of such
an array should be irrelevant to any sane implementation of any
array-manipulating library function.

I agree with you if you pass a length. So I don't see why either argument of
memcpy or memcmp can't be a nullptr if the size is zero.

That's not the case for NULL-terminated string functions.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Florian Weimer

2018-05-26 22:12:27 UTC

Post by Richard Hodges

Post by Florian Weimer
That's not compatible with existing practice, where NULL is
interpreted differently from an empty array. Even the much more
restrictive viewpoint, that functions processing arrays should accept
null pointers if the array length is given as zero, is controversial.
(Considering that std::vector::data() can return nullptr for an empty
vector, I personally think we need to allow nullptr for zero-length
memset/memcpy/memcmp, but that seems a minority view.)

I find it odd that this is a minority view.

It's what's currently in the C and C++ standards, and glibc has
annotations which GCC could use to optimize based on these
preconditions:

# define __nonnull(params) __attribute__ ((__nonnull__ params))
…
extern void *memset (void *__s, int __c, size_t __n) __THROW __nonnull ((1));

The languages have a strained relationship with zero-sized objects.
It would be quite consistent to disallow zero-length
memset/memcpy/memcmp altogether.

Richard Hodges

2018-05-27 13:20:29 UTC

Post by Florian Weimer

Post by Richard Hodges

Post by Florian Weimer
That's not compatible with existing practice, where NULL is
interpreted differently from an empty array. Even the much more
restrictive viewpoint, that functions processing arrays should accept
null pointers if the array length is given as zero, is controversial.
(Considering that std::vector::data() can return nullptr for an empty
vector, I personally think we need to allow nullptr for zero-length
memset/memcpy/memcmp, but that seems a minority view.)

I find it odd that this is a minority view.

It's what's currently in the C and C++ standards, and glibc has
annotations which GCC could use to optimize based on these
preconditions:
# define __nonnull(params) __attribute__ ((__nonnull__ params))
…
extern void *memset (void *__s, int __c, size_t __n) __THROW __nonnull ((1));
The languages have a strained relationship with zero-sized objects.
It would be quite consistent to disallow zero-length
memset/memcpy/memcmp altogether.

This I accept, but I don't think there should be any strain. Zero is merely
a specialisation of N. Memcpy et al already check the value of N in order
to handle the tail ends of non-aligned moves. To refuse one more check for
zero seems to me to be a premature optimisation, which forces inelegance
onto the user of the functions.

Hyman Rosen

2018-05-27 19:01:32 UTC

Zero is merely a specialisation of N.

I would guess that this relates back to Intel segmented architecture
semantics. Conceivably, memcpy could just load up its parameters into address
registers, and null pointers could cause hardware exceptions, so the functions
were (un)defined not to allow that.

Thiago Macieira

2018-05-27 20:54:04 UTC

Post by Hyman Rosen

Zero is merely a specialisation of N.

I would guess that this relates back to Intel segmented architecture
semantics. Conceivably, memcpy could just load up its parameters into address
registers, and null pointers could cause hardware exceptions, so the functions
were (un)defined not to allow that.

That's not the case.

The null segment descriptor on 32-bit protected mode x86 or 64-bit long mode
does not cause an exception just by being loaded into a segment register.
Similarly, the null address does not cause an exception by being loaded onto a
branch register on IA-64.

That's not the case for invalid addresses, on either architecture. So
memcpy(invalid, valid, 0) or memcpy(valid, invalid, 0) could fault, despite
the size zero.

I don't know about other architectures, but it's not inconceivable.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Hyman Rosen

2018-05-27 19:09:49 UTC

Post by Florian Weimer

Post by Hyman Rosen
1) A properly aligned and accessible block of memory that contains a
valid object representation of an object contains the object.

I don't think this is really possible to achieve without library
implementation changes. Clear semantics also seem to require garbage
collection.

No, I think you misunderstand what I mean by "valid object representation."

Post by Florian Weimer
The reason for my concerns is that the semantics of objects can depend
on their address. This can either be due to internal pointers within
the object itself, or some external dependency.

For sure. And objects with virtual bases or functions can have embedded
hidden data whose semantics aren't defined by the standard. But I thought my
"weasel wording" was clear enough. When I say "contains a valid object
representation," I mean just that - the data has to look like a valid object,
including incorporating the address where it's located. In the most general
sense, it requires that the programmer know the implementation details of the
language and compiler, and duplicate what the compiler does when creating
objects. But what I want is for the language to give the programmer a fighting
chance. What we have now are compilers using every excuse to prevent such code
from working.

Florian Weimer

2018-06-09 18:21:21 UTC

Post by Hyman Rosen

Post by Florian Weimer

Post by Hyman Rosen
1) A properly aligned and accessible block of memory that contains a
valid object representation of an object contains the object.

I don't think this is really possible to achieve without library
implementation changes. Clear semantics also seem to require garbage
collection.

No, I think you misunderstand what I mean by "valid object
representation."
When I say "contains a valid object representation," I mean just
that - the data has to look like a valid object, including
incorporating the address where it's located. In the most general
sense, it requires that the programmer know the implementation
details of the language and compiler, and duplicate what the
compiler does when creating objects.

The programmer also has to follow the (for the lack of a better word)
temporal constraints of the language, otherwise the contents might not
be observable as an object. For example, if the implementation can
observe that a particular type of object is never created explicitly, it
may assume so throughout the program.

My impression is that the main challenges lie not so much in the
implementation-specific interpretation of bit patterns (where, as far
as I know, historically a lot of the undefinedness of type punning
derived from, simply because no one wanted to specify what reading the
individual bits of a float would give as a result), but the temporal
effects of object existence in modern compilers. And I don't think
it's possible to fix that with a simple proposal and isolated wording
changes.

Hyman Rosen

2018-06-10 22:57:25 UTC

Post by Florian Weimer
For example, if the implementation can
observe that a particular type of object is never created explicitly, it
may assume so throughout the program.

Not in my version of the language. In my version, if a block of properly
aligned memory contains a valid representation of an object of that type,
then a pointer to that block cast to the object type points to an object of
that type, and may be used as such.

Post by Florian Weimer
And I don't think
it's possible to fix that with a simple proposal and isolated wording
changes.

I don't see why my proposed rule (as restated above) wouldn't work.
But remember that I hold no truck with optimizationism, so that arguments
that my rule would prevent the compiler from optimizing certain constructs
don't matter to me.

Andrey Semashev

2018-05-25 19:07:28 UTC

Post by Thiago Macieira

Post by Language Lawyer

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.
Really? Probably I was misled by some «common "knowledge"» that «in C
you can do type punning through union or with memcpy, but in C++ memcpy
is the only option».
A lot of such «knowledge» is hanging around since «C is a portable
assembler» times. :-\

Type punning implies that you reinterpret the bits of representation of
an object of one type T1 as an object of another type T2. Simply copying
the bits around does not require their interpretation and is
well-defined as long as the type is trivially copyable. Interpretation
of the bits happens when you use the object in some context, like a math
expression, for example. This is only well-defined if T1 and T2 are the
same type (barring cv-qualification).

Language Lawyer

2018-05-25 19:47:41 UTC

Post by Andrey Semashev

Post by Thiago Macieira

Post by Language Lawyer

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

float f;
int i = 42;
memcpy(&f, &i, sizeof(f));

There's no type punning in this example.
Really? Probably I was misled by some «common "knowledge"» that «in C
you can do type punning through union or with memcpy, but in C++ memcpy
is the only option».
A lot of such «knowledge» is hanging around since «C is a portable
assembler» times. :-\

Type punning implies that you reinterpret the bits of representation of
an object of one type T1 as an object of another type T2. Simply copying
the bits around does not require their interpretation and is
well-defined as long as the type is trivially copyable. Interpretation
of the bits happens when you use the object in some context, like a math
expression, for example. This is only well-defined if T1 and T2 are the
same type (barring cv-qualification).

Ok, ok. The example is incomplete. It was implied that `f` will be used
after memcpy somehow as an object of `float` type.

Thiago Macieira

2018-05-26 02:30:10 UTC

Post by Language Lawyer
Ok, ok. The example is incomplete. It was implied that `f` will be used
after memcpy somehow as an object of `float` type.

Then the behaviour is at least unspecified. There's no telling what the bytes
represent in float. Your application may receive a SIGFPE due to consuming an
invalid FP.

Like I've just posted in another email: I think the C++ standard should adopt
C's wording and say this is a valid operation and the bits copied are
reinterpreted according to the type's representation. Whether that's a valid
representation or not, what each bit means, is left unspecified and is up to
the programmer to know according to the target hardware and ABI.

Also note that this is not an optimisation black hole. For example, the
compiler may know that value 0x7f800000 when copied onto a float results in
positive infinity and proceed to evaluate all finite comparisons as less than
this value.
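
A sketch of that case, assuming a 32-bit IEC 559 float with the same byte order as the integer types. The function name infinity_from_bits is hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEC 559 / IEEE 754 floats");
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Hypothetical helper: build a float from the pattern 0x7f800000
// (sign 0, exponent all ones, mantissa zero).
inline float infinity_from_bits()
{
    const std::uint32_t bits = 0x7f800000u;
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under those assumptions every finite float, including std::numeric_limits<float>::max(), compares less than the result.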
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Language Lawyer

2018-05-25 19:59:08 UTC

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

A more complete example is the well-known Quake inverse square root
optimization:
https://stackoverflow.com/questions/17789928/whats-a-proper-way-of-type-punning-a-float-to-an-int-and-vice-versa
Basically the question here is whether the accepted answer is well defined.
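
For reference, the technique in its widely circulated memcpy form looks roughly like this. This is a sketch, not the code from the linked answer; it assumes a 32-bit IEC 559 float, and the name fast_inv_sqrt is hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEC 559 / IEEE 754 floats");
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Approximate 1/sqrt(x) for positive, finite x.
inline float fast_inv_sqrt(float x)
{
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // read the float's bit pattern
    bits = 0x5f3759dfu - (bits >> 1);      // initial guess from the well-known constant
    float y;
    std::memcpy(&y, &bits, sizeof y);      // write the pattern back into a float
    return y * (1.5f - 0.5f * x * y * y);  // one Newton-Raphson refinement step
}
```

Both memcpy calls copy between objects of different types, which is the case under discussion.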

'Edward Catmur' via ISO C++ Standard - Discussion

2018-05-25 21:22:25 UTC

Post by Language Lawyer

Post by Jens Maurer

Post by Language Lawyer
Where exactly it is defined?

What exactly do you mean?
Could you give some example code, please?

More complete example is a well-known Quake inverse square root
optimization:
https://stackoverflow.com/questions/17789928/whats-a-proper-way-of-type-punning-a-float-to-an-int-and-vice-versa
Basically the question here is whether the accepted answer is well defined.

The answer is correct in that the technique it describes will work on every
non-pathological implementation. It is undefined in the sense that an
implementation could invoke undefined behavior on that code and still be
conformant. It is to be hoped that future versions of the standard will fix
this or provide an alternate mechanism (likely bit_cast, above).
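
Until such a facility is available, a stand-in is commonly written on top of memcpy. The sketch below uses the hypothetical names bit_cast_sketch and bits_of; it inherits whatever guarantees memcpy itself has, which is the open question here, and it additionally requires the destination type to be default constructible.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a bit_cast facility, built on memcpy.
template <class To, class From>
To bit_cast_sketch(const From& src)
{
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<From>::value,
                  "source type must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "destination type must be trivially copyable");
    static_assert(std::is_default_constructible<To>::value,
                  "this sketch default-constructs the destination");
    To dst;
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
}

// Hypothetical usage: the bit pattern of a float as a 32-bit integer.
inline std::uint32_t bits_of(float f)
{
    return bit_cast_sketch<std::uint32_t>(f);
}
```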

Edward Catmur

2018-05-25 09:31:57 UTC

Post by Language Lawyer
Where exactly it is defined?
Is this http://eel.is/c++draft/cstring.syn#1: «The contents and meaning
of the header <cstring> are the same as the C standard library header
<string.h>» plus definition from the C standard: «The memcpy function
copies n characters from the object pointed to by s2 into the object
pointed to by s1» enough to make it defined?

Yes. C 7.24.1p3 with 7.24.2.1p2 specifies that memcpy copies n characters
as unsigned char. Copying as unsigned char is permitted by [basic.lval]/11.
In addition, memcpy is mentioned in the footnotes and examples to
[basic.types]/2-3.

Even if memcpy does something better internally (copying up to 64 bytes at
once using SIMD instructions, say) it is still required to behave as if it
is copying bytes singly (though in an unspecified order).
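
As an illustration of the "as if" behaviour, a byte-at-a-time reference version might look like the sketch below. The name copy_bytewise is hypothetical. It shows the observable effect memcpy has to match; it is not a claim about how any library implements memcpy.

```cpp
#include <cstddef>

// Hypothetical reference version: copy n bytes one at a time through
// unsigned char pointers. The regions must not overlap, as with memcpy.
inline void* copy_bytewise(void* dst, const void* src, std::size_t n)
{
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    for (std::size_t i = 0; i < n; ++i)
        d[i] = s[i];
    return dst;
}
```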

Language Lawyer

2018-05-25 11:13:30 UTC

Post by Edward Catmur

Post by Language Lawyer
Where exactly it is defined?
Is this http://eel.is/c++draft/cstring.syn#1: «The contents and meaning
of the header <cstring> are the same as the C standard library header
<string.h>» plus definition from the C standard: «The memcpy function
copies n characters from the object pointed to by s2 into the object
pointed to by s1» enough to make it defined?

Yes. C 7.24.1p3 with 7.24.2.1p2 specifies that memcpy copies n characters
as unsigned char. Copying as unsigned char is permitted by [basic.lval]/11.
In addition, memcpy is mentioned in the footnotes and examples to
[basic.types]/2-3.

[basic.types]/2-3 is about copying object into array of chars and back or
copying between 2 objects of the same type. This is not punning.
Also, are footnotes normative?

Ville Voutilainen

2018-05-25 11:14:09 UTC

Post by Language Lawyer
Also, are footnotes normative?

No.

Language Lawyer

2018-05-25 11:17:58 UTC

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows copying objects of trivially copyable types with memcpy?
The paragraph I've cited in the first message?

Ville Voutilainen

2018-05-25 11:31:00 UTC

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows copying objects of trivially copyable types with memcpy? The
paragraph I've cited in the first message?

Nothing that I'm aware of. You need to round-trip via a char/unsigned
char/byte array.

Language Lawyer

2018-05-25 11:52:39 UTC

Post by Ville Voutilainen

Post by Language Lawyer

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows copying objects of trivially copyable types with memcpy? The
paragraph I've cited in the first message?

Nothing that I'm aware of. You need to round-trip via a char/unsigned
char/byte array.

And. What allows this round-trip? :-/

Ville Voutilainen

2018-05-25 12:44:07 UTC

Post by Language Lawyer

Post by Ville Voutilainen

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows copying objects of trivially copyable types with memcpy? The
paragraph I've cited in the first message?

Nothing that I'm aware of. You need to round-trip via a char/unsigned
char/byte array.

And. What allows this round-trip? :-/

[basic.types]/2, although it doesn't say very clearly that copying the
bytes back is well-defined.
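
The round-trip in question can be sketched as follows. Point and save_and_restore are hypothetical names used only for illustration: the object's bytes are saved into an unsigned char array, the object is overwritten, and the saved bytes are copied back.

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical trivially copyable type used only for illustration.
struct Point {
    int x;
    int y;
};
static_assert(std::is_trivially_copyable<Point>::value,
              "the round-trip applies to trivially copyable types");

inline Point save_and_restore(Point p)
{
    unsigned char buf[sizeof(Point)];
    std::memcpy(buf, &p, sizeof p);  // save the object's bytes
    p.x = -1;                        // overwrite the object
    p.y = -1;
    std::memcpy(&p, buf, sizeof p);  // copy the saved bytes back
    return p;
}
```

After the second memcpy the object is expected to hold its original value again.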

Language Lawyer

2018-05-25 18:14:18 UTC

Post by Language Lawyer

Post by Language Lawyer

Post by Ville Voutilainen

Post by Language Lawyer

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows copying objects of trivially copyable types with memcpy? The
paragraph I've cited in the first message?

Nothing that I'm aware of. You need to round-trip via a char/unsigned
char/byte array.

And. What allows this round-trip? :-/

[basic.types]/2, although it doesn't say very clearly that copying the
bytes back is well-defined.

But the [basic.types]/2 is similar to /3 in the sense it doesn't mention
memcpy in normative contexts.

Ville Voutilainen

2018-05-25 19:05:11 UTC

Post by Language Lawyer

Post by Ville Voutilainen
[basic.types]/2, although it doesn't say very clearly that copying the
bytes back is well-defined.

But the [basic.types]/2 is similar to /3 in the sense it doesn't mention
memcpy in normative contexts.

It's trying to avoid being overly restrictive, so that memcpy/memmove,
a manual char-copying loop
or whatever mechanism are all allowed.

Language Lawyer

2018-05-25 19:34:31 UTC

Post by Ville Voutilainen

Post by Language Lawyer

Post by Ville Voutilainen
[basic.types]/2, although it doesn't say very clearly that copying the
bytes back is well-defined.

But the [basic.types]/2 is similar to /3 in the sense it doesn't mention
memcpy in normative contexts.

It's trying to avoid being overly restrictive, so that memcpy/memmove,
a manual char-copying loop
or whatever mechanism are all allowed.

Is a char-copying loop well-defined? https://wg21.cmeerw.net/cwg/issue1701
makes me doubt it.
But I'll accept that memcpy "magically" works.


'Edward Catmur' via ISO C++ Standard - Discussion

2018-05-25 15:28:29 UTC

Post by Language Lawyer

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows to copy objects of trivially copyable types with memcpy?
The paragraph I've cited in the first message?

If the objects are of the exact same type, [basic.types]/3. Otherwise
(even for types whose aliasing is permitted by [basic.lval]/11), memcpy
between them is not permitted, whether using an intermediate byte array or
copying directly.

In practice, implementations are fine with this copying on condition the
object representation of the source value is the object representation of a
value of the destination type.
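
The same-type case that [basic.types]/3 covers can be sketched as follows. The struct Point and the helper copy_with_memcpy are hypothetical names chosen for illustration; source and destination have exactly the same trivially copyable type.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical trivially copyable type used only for this illustration.
struct Point {
    int x;
    int y;
};

static_assert(std::is_trivially_copyable<Point>::value,
              "Point must be trivially copyable for a byte-wise copy");

// Copies the underlying bytes of `src` into a distinct object of the same
// type; afterwards `dst` holds the same value as `src`.
inline Point copy_with_memcpy(const Point& src) {
    Point dst{0, 0};
    std::memcpy(&dst, &src, sizeof(Point));
    return dst;
}
```

The static_assert documents the precondition; copying between objects of different types is the case the thread disputes and is deliberately not shown here.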


Nicol Bolas

2018-05-25 19:47:55 UTC

Post by 'Edward Catmur' via ISO C++ Standard - Discussion

Post by Language Lawyer

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows to copy objects of trivially copyable types with memcpy?
The paragraph I've cited in the first message?

If the objects are of the exact same type, [basic.types]/3. Otherwise
(even for types whose aliasing is permitted by [basic.lval]/11), memcpy
between them is not permitted, whether using an intermediate byte array or
copying directly.

Well, it is technically legal to byte copy into another object. However, by
doing so you will have reused the storage of that "another object" and
therefore it no longer exists. So while you can copy those bytes out of the
object, you cannot access that object through a glvalue of that type
anymore.

So you can copy the bytes of an `int i` into a `float f`, but it's not a
float anymore, and you can't use `f` to access it as if it were a float.

Post by 'Edward Catmur' via ISO C++ Standard - Discussion
In practice, implementations are fine with this copying on condition the
object representation of the source value is the object representation of a
value of the destination type.

And of course, `std::bit_cast` makes this explicit.
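
The facility mentioned here was standardized in C++20 as `std::bit_cast` in `<bit>`. For older compilers, the idea can be approximated with a memcpy-based helper. The name bit_cast_sketch below is hypothetical, and the sketch relies on the cross-type memcpy that implementations accept in practice; whether the wording of the standard guarantees it is exactly what this thread debates.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical pre-C++20 approximation of std::bit_cast: reinterprets the
// object representation of `from` as a value of type To.
template <class To, class From>
To bit_cast_sketch(const From& from) {
    static_assert(sizeof(To) == sizeof(From),
                  "source and destination must have the same size");
    static_assert(std::is_trivially_copyable<From>::value,
                  "source type must be trivially copyable");
    static_assert(std::is_trivially_copyable<To>::value,
                  "destination type must be trivially copyable");
    static_assert(std::is_default_constructible<To>::value,
                  "this sketch default-constructs the destination");
    To to;
    std::memcpy(&to, &from, sizeof(To));  // copy the bytes, not the value
    return to;
}
```

A typical use would be `bit_cast_sketch<std::uint32_t>(1.5f)`, assuming float is 32 bits wide on the target platform; casting the result back to float is expected to reproduce the original value.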


Ville Voutilainen

2018-05-25 19:54:06 UTC

Post by Nicol Bolas

Post by Language Lawyer

Post by Language Lawyer
Ok. What allows to copy objects of trivially copyable types with memcpy?
The paragraph I've cited in the first message?

If the objects are of the exact same type, [basic.types]/3. Otherwise
(even for types whose aliasing is permitted by [basic.lval]/11), memcpy
between them is not permitted, whether using an intermediate byte array or
copying directly.

Well, it is technically legal to byte copy into another object. However, by

A major point of this thread is that nothing says so.

Post by Nicol Bolas
doing so you will have reused the storage of that "another object" and
therefore it no longer exists. So while you can copy those bytes out of the
object, you cannot access that object through a glvalue of that type
anymore.
So you can copy the bytes of an `int i` into a `float f`, but it's not a
float anymore, and you can't use `f` to access it as if it were a float.

Copying the representation bytes of an object to the storage of another
object of different type doesn't change the type in that storage, at least
not according to anything the standard says. To change the type in that
storage, you need to reuse that storage for a different type, which doesn't
happen by copying bytes over that storage, it happens by copying an int over
that storage, or otherwise creating an int in that storage.
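
A minimal sketch of what "creating an int in that storage" can look like, using placement new on a raw byte buffer. The helper name reuse_storage_as_int is hypothetical, and the buffer is used instead of a declared float variable to keep the example simple.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical helper, for illustration only.
inline int reuse_storage_as_int(int value) {
    // Raw storage large enough and suitably aligned for either type.
    constexpr std::size_t size =
        sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    alignas(int) alignas(float) unsigned char storage[size];

    // A float object is created in the storage first.
    float* pf = new (storage) float(1.0f);
    assert(*pf == 1.0f);

    // Creating an int in the same storage reuses it: the float's lifetime
    // ends and an int object begins its lifetime there.
    int* pi = new (storage) int(value);
    return *pi;
}
```

In contrast, a plain memcpy of an int's bytes over the float would leave the float object in place, which is the distinction drawn above.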


Language Lawyer

2018-05-25 19:56:08 UTC

Post by Nicol Bolas

Post by 'Edward Catmur' via ISO C++ Standard - Discussion

Post by Language Lawyer

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows to copy objects of trivially copyable types with memcpy?
The paragraph I've cited in the first message?

If the objects are of the exact same type, [basic.types]/3. Otherwise
(even for types whose aliasing is permitted by [basic.lval]/11), memcpy
between them is not permitted, whether using an intermediate byte array or
copying directly.

Well, it is technically legal to byte copy into another object. However,
by doing so you will have reused the storage of that "another object" and
therefore it no longer exists. So while you can copy those bytes out of the
object, you cannot access that object through a glvalue of that type
anymore.

Are you sure that memcpy-ing ends the lifetime of an object by reusing its
storage? I think only placement new is allowed to do this.
AFAIK, in C, memcpy is a way to change the effective type of an object, but
only if it doesn't have a declared type. Here the object has a declared type,
and also in C++ there is no such thing as "effective type".


Nicol Bolas

2018-05-26 00:31:39 UTC

Post by Language Lawyer

Post by Nicol Bolas

Post by 'Edward Catmur' via ISO C++ Standard - Discussion

Post by Language Lawyer

Post by Language Lawyer
Also, are footnotes normative?

No

Ok. What allows to copy objects of trivially copyable types with
memcpy? The paragraph I've cited in the first message?

If the objects are of the exact same type, [basic.types]/3. Otherwise
(even for types whose aliasing is permitted by [basic.lval]/11), memcpy
between them is not permitted, whether using an intermediate byte array or
copying directly.

Well, it is technically legal to byte copy into another object. However,
by doing so you will have reused the storage of that "another object" and
therefore it no longer exists. So while you can copy those bytes out of the
object, you cannot access that object through a glvalue of that type
anymore.

Are you sure that memcpy-ing ends lifetime of an object by reusing its
storage? I think only placement new is allowed to do this.

That could be. I assumed that copying over the bytes of an object counted
as reuse, but I can see now that [basic.life]/1 says "reused by an object",
and of course, no object is reusing it.

