[std-discussion] Committee stance on using aligned char arrays as raw storage without placement new for trivial types?

I cannot tell you if any discussion has been had. However, I can give you
some of the history of what in C++ prevents merely allocating memory from
creating an object. That in itself is not definitive about discussions, but
it is suggestive.

I do not actually have a copy of C++98 or C++03. The oldest working draft I
can find that is available is N1638, which was released in 2004.

An object is created by a definition (3.1), by a new-expression (5.3.4)
or by the implementation (12.2) when needed.

I do have late drafts of C++11 and C++14. I'm not going to quote from their
version of this section because they all say *the exact same thing*.

N4616, the current working draft leading into C++17, however, does change
this wording:

An object is created by a definition (3.1), by a new-expression (5.3.4),
when implicitly changing the active member of a union (9.3), or when a
temporary object is created (4.4, 12.2).

So the only change has been essentially a defect fix that makes unions
actually work, in accord with the standard.

In at least 12 years of standardization, the committee has made no
substantive change to the causes of bringing an object into being. While
this is not conclusive, the fact that C++17 did put a fix into this section
means that they have looked at it and talked about it at some point. So I
would suggest that, if there was discussion about it, it did not progress
beyond discussion.

Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++
Standard only allows using a char array as raw storage (see also
std::aligned_storage) when objects are put into this via placement new,
even for e.g. int or other trivial(*) types.
http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array
alignas(int) char buf[sizeof(int)];
void f() {
    // turn the memory into an int: (??) from the POV of the abstract machine!
    ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
    *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}
Now, I'm **not** asking whether the current C++ Standard requires - or not - the no-op placement new for this code to be defined.
*What I would be interested in is whether this has been discussed in the committee (CWG?) in the last few years, and whether there is any agreement on whether omitting the placement new (for trivial types) should be allowed or whether Standard C++ should absolutely require it.*
Simple links to any paper(s) discussing this would already be appreciated; the only reference I found was P0137R1, and that's more about clarifying current wording afaict.
Thanks.
- Martin
p.s.: (*) is "trivial type" the correct term?

Correct term for what? Trivial Type is *a term* in C++, but it's unclear
what it would mean for what you want to do.

Conceptually, a TrivialType is a type which is a pure block-of-bits, one
for which any value of those bits is no less legal than any other. But C++
has other kinds of types.

A TriviallyCopyable type is a type for which a byte-by-byte copy operation
is equivalent to a language-level copy or move operation. A
TriviallyDefaultConstructible type is a type for which being uninitialized
is a legitimate state. A TriviallyDestructible type is a type whose
destruction is essentially irrelevant and can be ignored.
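
The categories above can be queried with the standard <type_traits> header. This is only a minimal sketch: `Plain` and `Guarded` are made-up example types, not anything from this thread, and the flag names are illustrative.

```cpp
#include <type_traits>

// Hypothetical example type: no user-provided special member functions.
struct Plain {
    int id;
    double value;
};

// Hypothetical example type: user-provided constructor and destructor.
struct Guarded {
    Guarded() : id(0) {}
    ~Guarded() {}
    int id;
};

// One flag per category discussed above.
constexpr bool plain_copyable =
    std::is_trivially_copyable<Plain>::value;
constexpr bool plain_default_constructible =
    std::is_trivially_default_constructible<Plain>::value;
constexpr bool plain_destructible =
    std::is_trivially_destructible<Plain>::value;

constexpr bool guarded_copyable =
    std::is_trivially_copyable<Guarded>::value;
constexpr bool guarded_default_constructible =
    std::is_trivially_default_constructible<Guarded>::value;
constexpr bool guarded_destructible =
    std::is_trivially_destructible<Guarded>::value;
```

All three flags are true for `Plain`. For `Guarded` all three are false, because the user-provided destructor alone already rules out trivial copyability.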

p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)

Compilers *do* treat it as UB. UB doesn't mean "crash"; UB can still do
what you want.

The point of the UB designation is to allow implementations to be
reasonably fast. If you reinterpret cast a pointer to a different type, the
compiler doesn't have to check to see if that object really exists there;
it will simply trust your cast and pretend that there is an object there.

--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.

Martin Ba

2017-01-16 19:08:38 UTC

I cannot tell you if any discussion has been had. However, ...
... So the only change has been essentially a defect fix that makes unions
actually work, in accord with the standard.
In at least 12 years of standardization, the committee has made no
substantive change to the causes of bringing an object into being. While
this is not conclusive, the fact that C++17 did put a fix into this section
means that they have looked at it and talked about it at some point. So I
would suggest that, if there was discussion about it, it did not progress
beyond discussion.

Thanks a lot for that wrap up!

-snip-

Post by Martin Ba
p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers
that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)

Compilers *do* treat it as UB. UB doesn't mean "crash"; UB can still do
what you want.
The point of the UB designation is to allow implementations to be
reasonably fast. If you reinterpret cast a pointer to a different type, the
compiler doesn't have to check to see if that object really exists there;
it will simply trust your cast and pretend that there is an object there.

What I meant by "treating it as UB" was in the same vein as, e.g., signed
integer overflow. Compilers today generate code that no longer works
if it relies on signed integer overflow, although older optimizers
didn't "break" anything.

In the same vein, I'm sure we can imagine several transformations that
would break code that has no "placement new" (from my OP) and that used to
work (and still does).

- Martin


Nicol Bolas

2017-01-16 20:29:02 UTC

What I meant by "treating it as UB" was in the same vein as, e.g., signed
integer overflow. Compilers generate code today that doesn't work anymore
if it relies/relied on signed integer overflow, although older optimizer
didn't "break" anything.
In the same vein, I'm sure we can imagine several transformations that
break code that has no "placement new" (from my OP) that used (and uses) to
work.

Such as?

Assuming a lack of signed integer overflow means that the compiler doesn't
have to insert code to *check* for integer overflow. The UB designation
allows correct code (code without overflows) to execute at maximum
performance. Any degrading of incorrect code is merely a consequence of
making correct code as fast as possible.
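
To make the overflow point concrete, here is a small sketch. The function names are made up for illustration. For every input that does not overflow, the comparison is true, so an optimizer that assumes "no signed overflow" is allowed to fold the whole body to `return true` and emit no comparison at all.

```cpp
#include <limits>

// True for every x where x + 1 does not overflow. Because signed
// overflow is undefined, a compiler may replace the body with
// `return true`; it owes nothing to a caller that passes INT_MAX.
bool next_is_larger(int x) {
    return x + 1 > x;
}

// Correct usage: only values strictly below INT_MAX are passed in.
bool demo_next_is_larger() {
    const int almost_max = std::numeric_limits<int>::max() - 1;
    return next_is_larger(0) && next_is_larger(almost_max);
}
```

Correct callers see the same result either way; only a caller that actually overflows can observe the difference.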

Let's say that you have a function that returns a `T*`. The fastest code
generated which uses this return value is code which assumes that `T*`
points to a live, valid object of type `T`. To do anything else makes
correct code slower. Even if you inlined that function or could otherwise
be certain that the `T*` was not valid, that simply means UB happens. Do
you think compiler writers are going to detect such circumstances and make
the code fail in some way?

Can you give an example of these "several transformations"? How would they
speed up correct code?

It should also be noted that, well, we can trace this rule back at least 12
years. Compilers haven't done anything to break such code yet.


Martin Ba

2017-01-16 21:18:30 UTC

What I meant by "treating it as UB" was in the same vein as, e.g., signed
integer overflow. Compilers generate code today that doesn't work anymore
if it relies/relied on signed integer overflow, although older optimizer
didn't "break" anything.
In the same vein, I'm sure we can imagine several transformations that
break code that has no "placement new" (from my OP) that used (and uses) to
work.

See e.g.:
http://stackoverflow.com/questions/7682477/why-does-integer-overflow-on-x86-with-gcc-cause-an-infinite-loop
... "The compiler assumes you won't cause undefined behavior, and optimizes
away the loop test."

Post by Nicol Bolas
Let's say that you have a function that returns a `T*`. The fastest code
generated which uses this return value is code which assumes that `T*`
points to a live, valid object of type `T`. To do anything else makes
correct code slower. Even if you inlined that function or could otherwise
be certain that the `T*` was not valid, that simply means UB happens. Do
you think compiler writers are going to detect such circumstances and make
the code fail in some way?
Can you give an example of these "several transformations"? How would they
speed up correct code?

In the same vein as gcc's -fdelete-null-pointer-checks - (see e.g.
http://stackoverflow.com/questions/23153445/can-branches-with-undefined-behavior-be-assumed-unreachable-and-optimized-as-dea)
the compiler sees a branch that definitely invokes UB and optimizes away
the branch and the branch check.
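
A rough sketch of the kind of code affected (the function name is made up, and whether a given compiler actually drops the check depends on its version and flags):

```cpp
// The pointer is dereferenced before it is tested. A null p would
// already have been UB at the dereference, so the optimizer is allowed
// to conclude that p is non-null and remove the later check entirely.
int read_then_check(int* p) {
    int value = *p;          // UB if p is null
    if (p == nullptr) {      // may be treated as always false
        return -1;
    }
    return value;
}
```

With a valid pointer the function returns the pointed-to value regardless of whether the check survives optimization.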

Post by Nicol Bolas
It should also be noted that, well, we can trace this rule back at least 12
years. Compilers haven't done anything to break such code yet.

Yet. And I assume (FWIW) as a matter of QoI they won't. But then, stuff
like -fwrapv and -fno-delete-null-pointer-checks have happened in the sense
that compiler writers saw legal optimization opportunities that break some
code. So, just because I or you cannot see any reason today, that's not
much consolation to me :-)


Jens Maurer

2017-01-16 21:21:58 UTC

What I meant by "treating it as UB" was in the same vein as, e.g., signed integer overflow. Compilers generate code today that doesn't work anymore if it relies/relied on signed integer overflow, although older optimizer didn't "break" anything.
In the same vein, I'm sure we can imagine several transformations that break code that has no "placement new" (from my OP) that used (and uses) to work.
Such as?

Here's a gentle introduction to undefined behavior vs.
optimizations:

http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

And don't forget to follow the link to http://blog.regehr.org/archives/213 .

Jens


Chris Hallock

2017-01-16 21:24:35 UTC

Post by Martin Ba
In the same vein, I'm sure we can imagine several transformations that
break code that has no "placement new" (from my OP) that used (and uses) to
work.

Such as?

An aggressively-optimizing compiler that assumes perfectly-well-formed C++
input could detect that this code is UB and therefore assume it never
executes (i.e. dead code that can be omitted from the binary).


T. C.

2017-01-16 23:14:35 UTC

Post by Martin Ba
In the same vein, I'm sure we can imagine several transformations that
break code that has no "placement new" (from my OP) that used (and uses) to
work.

Such as?

See Richard Smith's comment in this Reddit thread:
https://www.reddit.com/r/cpp/comments/5fk3wn/undefined_behavior_with_reinterpret_cast/dal28n0/
for an example.


Jens Maurer

2017-01-16 21:11:12 UTC

I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.

If you believe that intent is misguided, feel free to propose a change.
I'm sure compiler writers will explain to you how that substantially
pessimizes their code generation.

p.s.: (*) is "trivial type" the correct term?
p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers
that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)

Some compilers might make special allowances for their particular user
community, precisely out of concerns you cited. That doesn't make your
code any better.

Jens


Martin Ba

2017-01-16 21:49:45 UTC

I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

Yes, I very much feel the intent is misguided. For two reasons:

- This intent declares totally reasonable legacy code to be UB. At least I
consider it reasonable to *not* have to place a no-op placement new in
straightforward buffer-backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char buffer
to back any other typed data is automatically UB in C++. Another
unnecessary incompatibility.

All the change I can propose is that CWG considers some way to make this
work. (As it does in practice anyway *today*.) As far as I understand from
what I gleaned from P0137R1, the problem we have at the moment is
that the definition of objects (in the memory location sense) doesn't
allow this, and that it's pretty complex and hard to come up with something
that does allow it without restricting other things.

I'm sure compiler writers will explain to you how that substantially
pessimizes their code generation.

For this specific case, I do hope not. I'm braced for anything.

Post by Martin Ba
p.s.: (*) is "trivial type" the correct term?
p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)

Some compilers might make special allowances for their particular user
community, precisely out of concerns you cited. That doesn't make your
code any better.

As far as the C++ Standard goes, I'm not so much concerned with "better"
but with not allowing future compilers to break reasonable legacy code.

*When* using char arrays (or malloc'ed memory) as backing store for trivial
types, I fully assume most (non generic) existing code to *not* employ
placement new, simply because it's the straightforward thing to (not) do
and the placement new would be a no-op and all compilers up to today seem
to generate working code.

I think, here, the C++ Standard should take into account this "existing
practice". (Yeah, I know the same arguments were/are raised wrt. signed
integer overflow or the null-pointer-check elimination, but I at least feel
those cases, while possibly problematic in quite a few situations, are
historically rather more clear-cut. And at least both affect C and C++ code
the same.)

cheers.


Jens Maurer

2017-01-16 22:41:30 UTC

Post by Jens Maurer
I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.
If this is really the intent, then this needs to be more clearly communicated

Well, the C++ committee doesn't have a PR department. What's unclear about:

1.8p1 [intro.object]

"The constructs in a C++ program create, destroy, refer to, access, and manipulate objects.
An object is created by a definition (3.1), by a new-expression (5.3.4), when implicitly
changing the active member of a union (9.3), or when a temporary object is created
(4.4, 12.2). ..."

Post by Jens Maurer
and, I feel, rationalized. (Maybe it already has? Thats what the OP was actually about.)

I've reviewed the notes on the CWG discussions for P0137Rx and I could not
find anything that would directly talk about your example.

Post by Jens Maurer
If you believe that intent is misguided, feel free to propose a change.
* This intent declares UB totally reasonable legacy code.

Even legacy code should have used "memcpy" here.
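
For reference, the memcpy approach can be sketched as follows. The helper names `load_from_bytes` and `store_to_bytes` are made up for illustration. The bytes are copied into and out of a real object of type T, so no T ever has to "exist" inside the char buffer.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copy sizeof(T) bytes out of raw storage into a
// genuine T object and return it by value.
template <class T>
T load_from_bytes(const void* bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only meaningful for trivially copyable types");
    T result;
    std::memcpy(&result, bytes, sizeof(T));
    return result;
}

// Hypothetical helper: copy the object representation of value into
// raw storage.
template <class T>
void store_to_bytes(void* bytes, const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only meaningful for trivially copyable types");
    std::memcpy(bytes, &value, sizeof(T));
}
```

A call such as `store_to_bytes(buf, std::uint32_t{42})` followed by `load_from_bytes<std::uint32_t>(buf)` round-trips the value. Whether the copy is optimized away is up to the compiler.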

Post by Jens Maurer
At least I consider it reasonable too *not* have to place a no-op placement new in straightforward buffer backed code for trivial types.
* Since C doesn't have placement new, any C code that uses a char buffer to back any other typed data is automatically UB in C++. Another unnecessary incompatibility.
All the change I can propose is that CWG considers some way to make this work. (As it does in practice anyway *today*.) As I understand so far from what I gleaned from P0137R1 is that the problem we have at the moment is that the definition for objects (in the memory location sense) doesn't allow this and that it's pretty complex and hard to come up with something that does allow it without restricting other things.

Well, without a specific proposal on the table for rules that make this
work, but don't detrimentally affect other cases, I'm afraid nothing much
will happen.

Post by Jens Maurer
*When* using char arrays (or malloc'ed memory) as backing store for trivial types, I fully assume most (non generic) existing code to *not* employ placement new, simply because it's the straightforward thing to (not) do and the placement new would be a no-op and all compilers up to today seem to generate working code.

When compilers introduced type-based alias analysis, there was lots of broken code
that could be made to work with -fno-strict-aliasing. The code, eventually, got
fixed. I'm sure people using char arrays as backing store will fix their code
eventually, or learn to live with the shame of -fobjects-spring-to-life eternally.
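
As a reminder of what type-based alias analysis buys, here is a minimal sketch (the function name is made up). An `int` and a `float` cannot legally occupy the same object, so the compiler may assume the store through `f` leaves `*i` untouched and return the constant 1 without reloading.

```cpp
// With type-based alias analysis the compiler may assume i and f point
// to different objects, so the final read of *i can be replaced by the
// constant that was just stored.
int store_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}
```

Code that passes the same address through both parameters (via casts) is what -fno-strict-aliasing was needed for.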

Post by Jens Maurer
I think, here, the C++ Standard should take into account this "existing practice". (Yeah, I know the same arguments were/are raised wrt. signed integer overflow or the nullpointer-check-elimination, but I at least feel those cases, while possible problematic in quite some cases, are historically quite more clear cut. And at least both affect C and C++ code the same.)

Again, without a proposal, nothing is likely to happen.
(Off-topic: The rules for signed bit-shifts are subtly different between
C and C++, last I looked.)

Jens


b***@gmail.com

2017-01-17 14:45:14 UTC

Post by Jens Maurer
* This intent declares UB totally reasonable legacy code.

Even legacy code should have used "memcpy" here.

The problem is that, in any latency sensitive application, nobody uses
memcpy here. If we're reading in some binary protocol off the wire,
everybody's code is going to look something like:

switch (*reinterpret_cast<uint16_t const*>(buf)) {
case MsgA::value:
    handle(*reinterpret_cast<MsgA const*>(buf));
    break;
case MsgB::value:
    handle(*reinterpret_cast<MsgB const*>(buf));
    break;
// ...
}

instead of:

uint16_t msgType;
memcpy(&msgType, buf, sizeof(msgType));
switch (msgType) {
case MsgA::value: {
    MsgA msg;
    memcpy(&msg, buf, sizeof(msg));
    handle(msg);
    break;
}
case MsgB::value: {
    MsgB msg;
    memcpy(&msg, buf, sizeof(msg));
    handle(msg);
    break;
}
// ...
}

The reinterpret_cast version is definitely UB now and it was definitely UB
before. But writing a memcpy and hoping that the compiler recognizes that
it's really a reinterpret_cast (which sometimes works, sometimes doesn't)
isn't really a solution. Avoiding that extra write matters.

Maybe we just need a:

template <class T, class U>
T* start_lifetime_of_object_without_any_initialization_cast(U*);


Nicol Bolas

2017-01-17 16:27:20 UTC

template <class T, class U>
T* start_lifetime_of_object_without_any_initialization_cast(U*);

You already have that. It's called using placement `new` with default
initialization. If `T` is trivially default constructible, then `::new(p)
T` will begin the lifetime of `T` with no initialization.
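
A minimal sketch of what that looks like wrapped in a helper. The name `start_lifetime_at` is made up here, and the sketch makes no promise about the bytes that were in the buffer beforehand, so it writes to the new object before reading from it.

```cpp
#include <new>
#include <type_traits>

// Hypothetical helper: begin the lifetime of a T in existing storage.
// The caller must supply storage that is large enough and suitably
// aligned for T.
template <class T>
T* start_lifetime_at(void* storage) {
    static_assert(std::is_trivially_default_constructible<T>::value,
                  "default-initialization must not run a constructor body");
    return ::new (storage) T;   // default-initialization, nothing is written
}

alignas(int) unsigned char buf[sizeof(int)];

int write_through_new_object() {
    int* p = start_lifetime_at<int>(buf);  // use the pointer new returns
    *p = 42;                               // write first, then read
    return *p;
}
```

Using the pointer returned by the new-expression, rather than a cast of `buf`, also sidesteps the question of whether the cast yields the right pointer value.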


b***@gmail.com

2017-01-17 17:22:23 UTC

Post by Nicol Bolas
You already have that. It's called using placement `new` with default
initialization. If `T` is trivially default constructible, then `::new(p)
T` will begin the lifetime of `T` with no initialization.

What if T isn't trivially default constructible?
What if it is, but my compiler decides to "default-initialize" some
fundamental types with fixed values in debug mode?


Nicol Bolas

2017-01-17 18:10:24 UTC

What if T isn't trivially default constructible?

If a type is not trivially default constructible, then the writer of that
type has explicitly decided that the type *cannot* take on arbitrary
values. Therefore, it can only take on a specific set of values, defined by
the constructors of that type. It can still be trivially copyable, but that
requires you to start from a valid instance of that type, as created by one
of its constructors.

Therefore, whatever construct you want to have that adopts the data in
existing storage *cannot* apply to non-trivially default constructible
types.

Post by b***@gmail.com
What if it is, but my compiler decides to "default-initialize" some
fundamental types with fixed values in debug mode?

... That's a fair point.


T. C.

2017-01-17 18:17:19 UTC

What if T isn't trivially default constructible?

If a type is not trivially default constructible, then the writer of that
type has explicitly decided that the type *cannot* take on arbitrary
values. Therefore, it can only take on a specific set of values, defined by
the constructors of that type. It can still be trivially copyable, but that
requires you to start from a valid instance of that type, as created by one
of its constructors.
Therefore, whatever construct you want to have that adopts the data in
existing storage *cannot* apply to non-trivially default constructible
types.

Not necessarily. struct X { const int a; }; isn't trivially default
constructible but nothing else you wrote above applies to it.

Post by b***@gmail.com
What if it is, but my compiler decides to "default-initialize" some
fundamental types with fixed values in debug mode?

Post by Nicol Bolas
... That's a fair point.

There are also optimizers that treat the placement new as clobbering the
memory, because not performing initialization for dynamic storage duration
objects results in indeterminate values, not "whatever was there before".


Nicol Bolas

2017-01-17 19:35:02 UTC

Post by T. C.

What if T isn't trivially default constructible?

If a type is not trivially default constructible, then the writer of that
type has explicitly decided that the type *cannot* take on arbitrary
values. Therefore, it can only take on a specific set of values, defined by
the constructors of that type. It can still be trivially copyable, but that
requires you to start from a valid instance of that type, as created by one
of its constructors.
Therefore, whatever construct you want to have that adopts the data in
existing storage *cannot* apply to non-trivially default constructible
types.

Not necessarily. struct X { const int a; }; isn't trivially default
constructible but nothing else you wrote above applies to it.

Well, I found something quite troubling. `X` may not be trivially default
constructible, but it is trivially copyable. I have no idea how that is
even possible, since it effectively means you can change a `const` member
of a live object.


Ville Voutilainen

2017-01-17 20:26:01 UTC

Post by Nicol Bolas
Well, I found something quite troubling. `X` may not be trivially default
constructible, but it is trivially copyable. I have no idea how that is even
possible, since it effectively means you can change a `const` member of a
live object.

I don't see how that follows from the current rules. You can end the
lifetime of the X object
and create a new one in the same location, but I don't see how you can
change a const member
of a live object.


Nicol Bolas

2017-01-17 20:53:39 UTC

Post by Nicol Bolas
Well, I found something quite troubling. `X` may not be trivially default
constructible, but it is trivially copyable. I have no idea how that is even
possible, since it effectively means you can change a `const` member of a
live object.

X x1 = {5};
X x2 = {10};

memcpy(&x1, &x2, sizeof(X));

[basic.types]/3 tells us that this is perfectly legal. And it tells us
"obj2 shall subsequently hold the same value as obj1". It does not say that
`obj2` will be destroyed and constructed, or that `obj2`'s storage will be
reused for a new instance of `X`. It says that it holds the same value as
`obj1`.

That means `x1.a` *must* be 10. Which means we have changed a `const`
object.


Ville Voutilainen

2017-01-17 21:01:23 UTC

X x1 = {5};
X x2 = {10};
memcpy(&x1, &x2, sizeof(X));
[basic.types]/3 tells us that this is perfectly legal. And it tells us "obj2
shall subsequently hold the same value as obj1". It does not say that `obj2`
will be destroyed and constructed, or that `obj2`'s storage will be reused
for a new instance of `X`. It says that it holds the same value as `obj1`.
That means `x1.a` must be 10. Which means we have changed a `const` object.

You seem to be pretending that [dcl.type.cv]/4 doesn't exist.


Nicol Bolas

2017-01-17 21:19:40 UTC

On Tuesday, January 17, 2017 at 3:26:02 PM UTC-5, Ville Voutilainen

Post by Nicol Bolas
Well, I found something quite troubling. `X` may not be trivially default
constructible, but it is trivially copyable. I have no idea how that is even
possible, since it effectively means you can change a `const` member of a
live object.

X x1 = {5};
X x2 = {10};
memcpy(&x1, &x2, sizeof(X));
[basic.types]/3 tells us that this is perfectly legal. And it tells us "obj2
shall subsequently hold the same value as obj1". It does not say that `obj2`
will be destroyed and constructed, or that `obj2`'s storage will be reused
for a new instance of `X`. It says that it holds the same value as `obj1`.
That means `x1.a` must be 10. Which means we have changed a `const` object.

Post by Ville Voutilainen
You seem to be pretending that [dcl.type.cv]/4 doesn't exist.

If trivially copying objects with `const` members provokes UB, why are
objects with `const` members trivially copyable? Or is the idea that, since
such types are not assignable, you shouldn't be trying to use trivial copy
mechanics to assign to them?


Ville Voutilainen

2017-01-17 21:22:11 UTC

Post by Nicol Bolas
If trivially copying objects with `const` members provokes UB, why are
objects with `const` members trivially copyable? Or is the idea that, since
such types are not assignable, you shouldn't be trying to use trivial copy
mechanics to assign to them?

The latter, yes. You can still bit-blast them into buffers that don't
contain live objects yet,
so for that reason it's apparently rather important that such types
are trivially copyable,
but trivially copyable doesn't necessarily mean assignable or "can
blast values over existing objects".


b***@gmail.com

2017-01-17 22:26:57 UTC

Post by Ville Voutilainen
The latter, yes. You can still bit-blast them into buffers that don't
contain live objects yet,
so for that reason it's apparently rather important that such types
are trivially copyable,
but trivially copyable doesn't necessarily mean assignable or "can
blast values over existing objects".

I can bit-blast it into a buffer. But how can I bit-blast it out of a
buffer?

struct X { const int val; };

// this is all well and good
alignas(X) char buffer[sizeof(X)];
new (buffer) X{42};
::send(buffer, sizeof(buffer));

alignas(X) char recv_buffer[sizeof(X)];
::recv(recv_buffer, sizeof(recv_buffer));

// can't do this
X a; // nope: X has no usable default constructor
memcpy(&a, recv_buffer, sizeof(a));

// this is UB
X b(*reinterpret_cast<X const*>(recv_buffer));

// this is well-defined yet totally unmaintainable
int v;
memcpy(&v, recv_buffer, sizeof(v));
X c{v};


Nicol Bolas

2017-01-18 15:31:08 UTC

Now there's an interesting idea for a standard library function:

auto x = std::trivial_copy_construct<X>(recv_buffer);

Obviously, it can only take `X` types which are trivially copyable and copy
constructible. It returns a prvalue `X` (praise be to guaranteed elision).


Thiago Macieira

2017-01-17 21:46:33 UTC

struct X { const int a; };

Post by Nicol Bolas
X x1 = {5};
X x2 = {10};
memcpy(&x1, &x2, sizeof(X));
[basic.types]/3 tells us that this is perfectly legal. And it tells us
"obj2 shall subsequently hold the same value as obj1". It does not say that
`obj2` will be destroyed and constructed, or that `obj2`'s storage will be
reused for a new instance of `X`. It says that it holds the same value as
`obj1`.
That means `x1.a` *must* be 10. Which means we have changed a `const`
object.

12.8.2 [class.copy.assign] / 7 says the copy-assignment operator is deleted if
any non-static member is const. That's why the expression:

x1 = x2;

fails to compile. But [class]/6 still allows it to be trivially copyable:
each copy/move constructor and assignment operator is either trivial (both
constructors) or deleted (both assignment operators), and at least one of
them is non-deleted.

That means the memcpy above is not a copy-assignment. It's actually destroying
the old object and initialising a new one. That is, it's equivalent to:

x1.~X();
new (&x1) X(x2);

Which is perfectly valid, even in the context of const members.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Richard Smith

2017-01-17 00:23:08 UTC

Post by Martin Ba
Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++
Standard only allows using a char array as raw storage (see also
std::aligned_storage) when objects are put into this via placement new,
even for e.g. int or other trivial(*) types.
See: http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array or related

alignas(int) char buf[sizeof(int)];
void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

What I would be interested in is whether this has been discussed in the
committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new (for

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

- This intent declares totally reasonable legacy code to be UB. At least I
consider it reasonable to *not* have to place a no-op placement new in
straightforward buffer-backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char
buffer to back any other typed data is automatically UB in C++. Another
unnecessary incompatibility.

The above code has undefined behavior in C too. C's effective type rules
do not permit changing the effective type of a declared object to something
other than its declared type; it only permits that for objects allocated
with malloc or similar.

In the case where the storage /was/ allocated through malloc or similar,
C++ requires a placement new where C simply allows the effective type to
change through a store (and some parts of the C effective type model don't
work as a result...). It would seem reasonable to me for such allocation
functions to be specified to have implicitly created whatever set of
objects the following code relies on existing[1] -- the compiler typically
has to make that pessimistic assumption anyway, since it doesn't know what
objects the implementation of an opaque function might create, so it seems
like we'd lose little and gain more C compatibility by guaranteeing
something like that.

[1]: that is, we could require the compiler to assume that malloc runs a
sequence of placement news (for types with trivial default construction and
destruction) before it returns, where that set is chosen to be whatever set
gives the program defined behavior -- if such a set exists

Post by Martin Ba
All the change I can propose is that CWG considers some way to make this
work. (As it does in practice anyway *today*.) What I have gleaned so far
from P0137R1 is that the problem we have at the moment is that the
definition of objects (in the memory location sense) doesn't allow this,
and that it's pretty complex and hard to come up with something that does
allow it without restricting other things.

I'm sure compiler writers will explain to you how that substantially
pessimizes their code generation.

For this specific case, I do hope not. I'm braced for anything.

Post by Martin Ba
p.s.: (*) is "trivial type" the correct term?
p.p.s.: My personal impression on the matter is that requiring the
placement new for trivial types (like int, ...) is rather insane and the
amount of real world code compiled with C++ compilers
that would be broken should any C++ compiler/optimizer ever manage to

As far as the C++ Standard goes, I'm not so much concerned with "better"
but with not allowing future compilers to break reasonable legacy code.
*When* using char arrays (or malloc'ed memory) as backing store for
trivial types, I fully assume most (non generic) existing code to *not*
employ placement new, simply because it's the straightforward thing to
(not) do and the placement new would be a no-op and all compilers up to
today seem to generate working code.
I think, here, the C++ Standard should take into account this "existing
practice". (Yeah, I know the same arguments were/are raised wrt. signed
integer overflow or the null-pointer-check elimination, but I at least feel
those cases, while possibly problematic in quite a few cases, are
historically rather more clear-cut. And at least both affect C and C++ code
the same.)
cheers.

Nicol Bolas

2017-01-17 01:47:36 UTC

Post by Martin Ba
Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++

alignas(int) char buf[sizeof(int)];
void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

What I would be interested in is whether this has been discussed in the
committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new (for

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

The above code has undefined behavior in C too. C's effective type rules
do not permit changing the effective type of a declared object to something
other than its declared type; it only permits that for objects allocated
with malloc or similar.
In the case where the storage /was/ allocated through malloc or similar,
C++ requires a placement new where C simply allows the effective type to
change through a store (and some parts of the C effective type model don't
work as a result...). It would seem reasonable to me for such allocation
functions to be specified to have implicitly created whatever set of
objects the following code relies on existing[1] -- the compiler typically
has to make that pessimistic assumption anyway, since it doesn't know what
objects the implementation of an opaque function might create, so it seems
like we'd lose little and gain more C compatibility by guaranteeing
something like that.
[1]: that is, we could require the compiler to assume that malloc runs a
sequence of placement news (for types with trivial default construction and
destruction) before it returns, where that set is chosen to be whatever set
gives the program defined behavior -- if such a set exists

The result of a "sequence of placement news" on a piece of memory is the
creation of an object of the last type `new`ed. The C++ object model does
not permit storage to have an indeterminate object or many separate objects
(outside of nesting). If you allocate 4 bytes and new an `int` into it,
then it is an int. If you new a `float` into it, it stops being an `int`.

So, how would you suggest the object model change to accommodate such a
thing? What is the syntax that causes a piece of storage that contains all
objects to contain just one?

Personally? I say let it go. C++ programmers have managed to survive this
being UB since at least 2004. We're teaching C++ programmers nowadays to
avoid pointless casting; the average C++ programmer today is far more
likely to employ placement-new than to do casts and assume it was
constructed.

I'd rather the committee spend time shoring up the object model for genuine
C++ purposes, like making it possible for `vector` to be implemented
without UB.


Richard Smith

2017-01-17 02:04:49 UTC

Post by Martin Ba
Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++
Standard only allows using a char array as raw storage (see also
std::aligned_storage) when objects are put into this via placement new,
even for e.g. int or other trivial(*) types.
See: http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array or related

alignas(int) char buf[sizeof(int)];
void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

What I would be interested in is whether this has been discussed in the
committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new (for

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

The above code has undefined behavior in C too. C's effective type rules
do not permit changing the effective type of a declared object to something
other than its declared type; it only permits that for objects allocated
with malloc or similar.
In the case where the storage /was/ allocated through malloc or similar,
C++ requires a placement new where C simply allows the effective type to
change through a store (and some parts of the C effective type model don't
work as a result...). It would seem reasonable to me for such allocation
functions to be specified to have implicitly created whatever set of
objects the following code relies on existing[1] -- the compiler typically
has to make that pessimistic assumption anyway, since it doesn't know what
objects the implementation of an opaque function might create, so it seems
like we'd lose little and gain more C compatibility by guaranteeing
something like that.
[1]: that is, we could require the compiler to assume that malloc runs a
sequence of placement news (for types with trivial default construction and
destruction) before it returns, where that set is chosen to be whatever set
gives the program defined behavior -- if such a set exists

I never said they would all be at the start of the allocation.

Post by Nicol Bolas
So, how would you suggest the object model change to accommodate such a
thing? What is the syntax that causes a piece of storage that contains all
objects to contain just one?
Personally? I say let it go. C++ programmers have managed to survive this
being UB since at least 2004. We're teaching C++ programmers nowadays to
avoid pointless casting; the average C++ programmer today is far more
likely to employ placement-new than to do casts and assume it was
constructed.
I'd rather the committee spend time shoring up the object model for
genuine C++ purposes, like making it possible for `vector` to be
implemented without UB.

Nicol Bolas

2017-01-17 02:25:25 UTC

Post by Martin Ba
Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++

alignas(int) char buf[sizeof(int)];
void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

What I would be interested in is whether this has been discussed in the
committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new
(for trivial type) should be allowed or if Standard C++ should absolutely
require the placement new.

I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

- This intent declares totally reasonable legacy code to be UB. At least
I consider it reasonable to *not* have to place a no-op placement new in
straightforward buffer-backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char
buffer to back any other typed data is automatically UB in C++. Another
unnecessary incompatibility.

The above code has undefined behavior in C too. C's effective type
rules do not permit changing the effective type of a declared object to
something other than its declared type; it only permits that for objects
allocated with malloc or similar.
In the case where the storage /was/ allocated through malloc or similar,
C++ requires a placement new where C simply allows the effective type to
change through a store (and some parts of the C effective type model don't
work as a result...). It would seem reasonable to me for such allocation
functions to be specified to have implicitly created whatever set of
objects the following code relies on existing[1] -- the compiler typically
has to make that pessimistic assumption anyway, since it doesn't know what
objects the implementation of an opaque function might create, so it seems
like we'd lose little and gain more C compatibility by guaranteeing
something like that.
[1]: that is, we could require the compiler to assume that malloc runs
a sequence of placement news (for types with trivial default construction
and destruction) before it returns, where that set is chosen to be whatever
set gives the program defined behavior -- if such a set exists

I never said they would all be at the start of the allocation.


Richard Smith

2017-01-17 04:09:56 UTC

Post by Martin Ba
Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++
Standard only allows using a char array as raw storage (see also
std::aligned_storage) when objects are put into this via placement new,
even for e.g. int or other trivial(*) types.
See: http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array or related

alignas(int) char buf[sizeof(int)];
void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

What I would be interested in is whether this has been discussed in the
committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

- This intent declares totally reasonable legacy code to be UB. At least
I consider it reasonable to *not* have to place a no-op placement new in
straightforward buffer-backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char
buffer to back any other typed data is automatically UB in C++. Another
unnecessary incompatibility.

The above code has undefined behavior in C too. C's effective type

I never said they would all be at the start of the allocation.

... that doesn't make sense. I mean, where else are they going to be
*except* for the start? If I allocate 4 bytes, then you need to `new` up
both `int` and `float` (assuming they're both 4 bytes, of course). But
there's no room to `new` them at different addresses within that
allocation, since the allocation is only 4 bytes.

I don't know what this example is supposed to demonstrate.

So where would you be allocating these different objects?

If you allocate 8 bytes, there could be an int object at offset 0 and a
float object at offset 4.

Nicol Bolas

2017-01-17 13:31:48 UTC

Post by Martin Ba
Hi.
I'm currently trying to understand a few ... interesting ... observations
I have been making wrt. the C++ Standard and using char arrays as raw
storage.
Essentially, as far as I can tell (have been told), the current C++

alignas(int) char buf[sizeof(int)];
void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}

What I would be interested in is whether this has been discussed in the
committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new

If this is really the intent, then this needs to be more clearly
communicated and, I feel, rationalized. (Maybe it already has? That's what
the OP was actually about.)

If you believe that intent is misguided, feel free to propose a change.

- This intent declares totally reasonable legacy code to be UB. At least
I consider it reasonable to *not* have to place a no-op placement new in
straightforward buffer-backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char
buffer to back any other typed data is automatically UB in C++. Another
unnecessary incompatibility.

The above code has undefined behavior in C too. C's effective type
rules do not permit changing the effective type of a declared object to
something other than its declared type; it only permits that for objects
allocated with malloc or similar.
In the case where the storage /was/ allocated through malloc or
similar, C++ requires a placement new where C simply allows the effective
type to change through a store (and some parts of the C effective type
model don't work as a result...). It would seem reasonable to me for such
allocation functions to be specified to have implicitly created whatever
set of objects the following code relies on existing[1] -- the compiler
typically has to make that pessimistic assumption anyway, since it doesn't
know what objects the implementation of an opaque function might create, so
it seems like we'd lose little and gain more C compatibility by
guaranteeing something like that.
[1]: that is, we could require the compiler to assume that malloc runs
a sequence of placement news (for types with trivial default construction
and destruction) before it returns, where that set is chosen to be whatever
set gives the program defined behavior -- if such a set exists

I never said they would all be at the start of the allocation.

placement news (for types with trivial default construction and
destruction) before it returns, where that set is chosen to be whatever set
gives the program defined behavior -- if such a set exists

Therefore, if I 'malloc' 4 bytes of storage, then placement `new` will be
executed on that storage for both `int` and `float`. Among others. *That's
what you're asking for*.

And as I said, that would make the memory both an `int` and a `float` at

Post by Nicol Bolas
I never said they would all be at the start of the allocation.

Then where is it going to be? Where does the `int` get created and where
does the `float` get created, since there's not room enough for both?

I'm trying to understand what you're suggesting the standard do here, and
thus far, it does not make sense.

Richard Smith

2017-01-17 22:46:28 UTC

Post by Nicol Bolas
Therefore, if I 'malloc' 4 bytes of storage, then placement `new` will be
executed on that storage for both `int` and `float`. Among others. *That's
what you're asking for*.

No, it's not. If the program does this:

void *p = malloc(4);
int *n = (int*)p;
*n = 0;
float *f = (float*)p;
*f = 0;

then there is no set of placement new operations that malloc could have
performed that result in this program being valid, so the rule I described
does not place requirements on the behavior of this program.

But if the program does this:

void *p = malloc(8);
int *n = (int*)p;
*n = 0;
float *f = (float*)((char*)p + 4);
*f = 0;

... then that would be valid if malloc created an int at offset 0 and a
float at offset 4 (plus an array of chars covering the whole array to make
the pointer arithmetic valid).
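
To make the contrast between the two programs above concrete, here is a minimal sketch that writes the object creation out explicitly with placement new, as current C++ requires. The function names `disjoint_objects` and `reused_storage` are hypothetical and not from the thread, and the sketch assumes a `float` may be placed directly after an `int` (checked with a `static_assert`).

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Second program: an int at offset 0 and a float right after it.
// Both objects could have been created before any store happens, which is
// why a set of placement news run "inside malloc" would cover this case.
inline bool disjoint_objects() {
    static_assert(sizeof(int) % alignof(float) == 0,
                  "sketch assumes a float may start right after an int");
    void* p = std::malloc(sizeof(int) + sizeof(float));
    if (p == nullptr) return false;
    int* n = ::new (p) int;
    float* f = ::new (static_cast<char*>(p) + sizeof(int)) float;
    *n = 0;
    *f = 0;
    const bool ok = (*n == 0 && *f == 0.0f);
    std::free(p);
    return ok;
}

// First program: int and float at the same address. The float can only be
// created after the store to the int, so no set of objects created before
// malloc returns would make the original (cast-only) version valid.
inline bool reused_storage() {
    const std::size_t size =
        sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void* p = std::malloc(size);
    if (p == nullptr) return false;
    int* n = ::new (p) int;
    *n = 0;
    float* f = ::new (p) float;  // ends the int's lifetime, starts the float's
    *f = 0;
    const bool ok = (*f == 0.0f);
    std::free(p);
    return ok;
}
```

Both functions return `true` when the allocation succeeds; the difference lies only in when the second object has to come into existence.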

And as I said, that would make the memory both an `int` and a `float` at

Post by Nicol Bolas
I never said they would all be at the start of the allocation.

Then where is it going to be? Where does the `int` get created and where
does the `float` get created, since there's not room enough for both?

Not all allocations are of 4 bytes. You seem to be saying that because this
can't happen in one particular situation, it can't happen in any situation.
That obviously doesn't follow.

Richard Smith

2017-01-17 22:50:02 UTC

Post by Richard Smith
... then that would be valid if malloc created an int at offset 0 and a
float at offset 4 (plus an array of chars covering the whole array to make
the pointer arithmetic valid).

(This doesn't quite work because the "provides storage" rule doesn't permit
an array of plain char to provide storage for other objects, but I'm
increasingly thinking that's a mistake.)

Richard Hodges

2017-01-17 23:17:30 UTC

Post by Richard Smith
(This doesn't quite work because the "provides storage" rule doesn't
permit an array of plain char to provide storage for other objects, but I'm
increasingly thinking that's a mistake.)

The mistake was made when the committee decided that C++ was not a
low-level language that could deliberately do low-level things legally.

Clearly some young zealot thought that C++ was too useful, and the
standards committee needed to be distracted from doing useful work, like
providing ranges, proper threading, async networking, http (for goodness'
sake! it's 20 years old!!!), graphics and other useful libraries.

They seem to have done that by turning the committee into some kind of
theoretical computing debating society.

It's time the committee woke up and brought useful, practical improvements
to the language, rather than ridiculous meaningless rules that just get in
the way of expressing obvious logic.

Lambdas were a good start, and some kind of formalisation on threading was
useful, copy elision is great...

But the threading library is garbage. Unusable for anything serious. We
still have to reach for Boost, or TBB, or some other third-party crutch.
It's embarrassing, and a huge inconvenience when cross-compiling for Linux,
iOS, Android, etc., as I and people like me do.

For a language that's geared for correctness and performance, this is
disgraceful.

Someone needs to go through the committee with a stiff broom.

Thiago Macieira

2017-01-18 04:48:05 UTC

Post by Richard Hodges
The mistake was made when the committee decided that C++ was not a
low-level language that could deliberately do low-level things legally.
Clearly some young zealot thought that C++ was too useful, and the
standards committee needed to be distracted from doing useful work, like
providing ranges, proper threading, async networking, http (for goodness'
sake! it's 20 years old!!!), graphics and other useful libraries.

In the committee's defence and noting that I am not a member of the committee,
the rule that we're talking about here was in the original language of the
standard, in C++98, and is partially shared with C: the strict aliasing rule.
The reason we're talking about it now is that compilers have become a lot
smarter in the past 10 years and have begun using the UBs to optimise code.
Plus there was some well-intended clarification added to C++17 that apparently
had side-effects.

As for the other things that you think are useful, they need to be built layer
by layer. In order to have HTTP support, we need to have support for sockets,
threading, asynchronous operations, and manipulating URIs. In turn, in order to
have support for sockets, we need primitives for socket addresses and at least
a modicum of SSL/TLS control. Probably other things too.

Some of what I mentioned has papers in the standardisation track. Some others
have ideas. Some others are still unaddressed. But we are building it, little
by little.

Post by Richard Hodges
They seem to have done that by turning the committee into some kind of
theoretical computing debating society.

I have often thought the same. Just look at std::chrono.

Post by Richard Hodges
It's time the committee woke up and brought useful, practical improvements
to the language, rather than ridiculous meaningless rules that just get in
the way of expressing obvious logic.

I dispute that there aren't practical improvements being added and I dispute
that the rules are meaningless.

Post by Richard Hodges
But the threading library is garbage. Unusable for anything serious. We
still have to reach for Boost, or TBB, or some other third-party crutch.
It's embarrassing, and a huge inconvenience when cross-compiling for Linux,
iOS, Android, etc., as I and people like me do.

Care to elaborate what your problems are? TBB is not a benchmark, though: it
does a lot more than threading alone.

Post by Richard Hodges
For a language that's geared for correctness and performance, this is
disgraceful.
Someone needs to go through the committee with a stiff broom.

Can that someone be you?
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Demi Obenour

2017-01-17 05:36:42 UTC

I very much disagree.

C++ can use placement new, true. But C cannot, and many programs need to
compile and run as both.

Furthermore, I don't know of any reasonable way a compiler could exploit
this to produce better code. Strict aliasing doesn't apply, since char
pointers can alias anything. More importantly, you make it impossible to
take an aligned char array (say, one filled in by an I/O operation) and
cast it to an array of (say) int without an O(n) copy and a 2x memory
overhead! That is anything BUT fast. C++ should NOT impose such overheads.

Post by Nicol Bolas
The result of a "sequence of placement news" on a piece of memory is the
creation of an object of the last type `new`ed. The C++ object model does
not permit storage to have an indeterminate object or many separate objects
(outside of nesting). If you allocate 4 bytes and new an `int` into it,
then it is an int. If you new a `float` into it, it stops being an `int`.

So, how would you suggest the object model change to accommodate such a
thing? What is the syntax that causes a piece of storage that contains all
objects to contain just one?

Personally? I say let it go. C++ programmers have managed to survive this
being UB since at least 2004. We're teaching C++ programmers nowadays to
avoid pointless casting; the average C++ programmer today is far more
likely to employ placement-new than to do casts and assume it was
constructed.

I'd rather the committee spend time shoring up the object model for
genuine C++ purposes, like making it possible for `vector` to be
implemented without UB.
Thiago Macieira

2017-01-17 07:41:22 UTC

Post by Demi Obenour
C++ can use placement new, true. But C cannot, and many programs need to
compile and run as both.

That argument doesn't apply. You can write code that compiles as both C and
C++, but that does not mean the rules from one language apply in the other.

If you need to write C++-specific code, you can always just use #ifdef
__cplusplus.
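
As a rough illustration of that suggestion, a source file shared between C and C++ could hide the difference behind a macro. This is only a sketch: the macro `CREATE_IN_STORAGE` and the helper `make_int` are hypothetical names, not anything from the thread or from a library.

```cpp
#include <stdlib.h>

#ifdef __cplusplus
#include <new>
/* C++: a new-expression starts the lifetime of an object in the storage. */
#define CREATE_IN_STORAGE(type, storage) (::new (storage) type)
#else
/* C: for malloc'ed storage, the first store through the converted pointer
   sets the effective type, so a plain cast is enough. */
#define CREATE_IN_STORAGE(type, storage) ((type *)(storage))
#endif

/* Hypothetical helper: allocates an int with malloc and stores `value` in
   it. Returns NULL on allocation failure; the caller frees the result. */
static int *make_int(int value) {
    void *raw = malloc(sizeof(int));
    if (raw == NULL) {
        return NULL;
    }
    int *p = CREATE_IN_STORAGE(int, raw);
    *p = value;
    return p;
}
```

The same translation unit then compiles as C and as C++, and each language follows its own rules for how the `int` comes into existence.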
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Nicol Bolas

2017-01-17 13:33:15 UTC

Post by Demi Obenour
I very much disagree.
C++ can use placement new, true. But C cannot, and many programs need to
compile and run as both.
Furthermore, I don't know of any reasonable way a compiler could exploit
this to produce better code. Strict aliasing doesn't apply, since char
pointers can alias anything. More importantly, you make it impossible to
take an aligned char array (say, one filled in by an I/O operation) and
cast it to an array of (say) int without an O(n) copy and a 2x memory
overhead! That is anything BUT fast. C++ should NOT impose such overheads.

Or you could just create an `int` array to begin with, then pass it to an
I/O operation to be filled in.

Andrey Semashev

2017-01-17 10:47:20 UTC

Post by Nicol Bolas
The result of a "sequence of placement news" on a piece of memory is the
creation of an object of the last type `new`ed. The C++ object model
does not permit storage to have an indeterminate object or many separate
objects (outside of nesting).

Does it not? Could you provide a reference to the standard?

I have always assumed the following was well-defined code:

char* p = static_cast< char* >(malloc(sizeof(int) + sizeof(float)));
int* pi = new (p) int;
float* pf = new (p + sizeof(int)) float;

(null checks and alignment accounting skipped for brevity).
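
For completeness, a sketch of the same snippet with the skipped null check and alignment handling filled in. The names `TwoObjects`, `align_up` and `create_two` are hypothetical, and the sketch relies on `malloc` returning storage suitably aligned for any fundamental type.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical result type: the two objects plus the block to free later.
struct TwoObjects {
    int* pi;
    float* pf;
    void* storage;
};

// Rounds `offset` up to the next multiple of `alignment`.
constexpr std::size_t align_up(std::size_t offset, std::size_t alignment) {
    return (offset + alignment - 1) / alignment * alignment;
}

// Creates an int at the start of a malloc'ed block and a float at the first
// suitably aligned offset after it. Returns false if allocation fails.
inline bool create_two(TwoObjects& out) {
    const std::size_t float_offset = align_up(sizeof(int), alignof(float));
    char* p = static_cast<char*>(std::malloc(float_offset + sizeof(float)));
    if (p == nullptr) {
        return false;
    }
    out.storage = p;
    out.pi = new (p) int;
    out.pf = new (p + float_offset) float;
    return true;
}
```

The caller releases the block with `std::free(out.storage)` once both objects are no longer needed.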

Regarding the OP, I think it is fair to say that malloc returns a
storage, which is a sequence of bytes (chars) that is allowed to alias
any other type. In that sense, the compiler has no way to know what
actual objects are created by malloc in that storage, so when the user
cases the returned pointer, type aliasing effectively happens. Whether
that is UB or not is a grey area because we don't know if the storage
actually contains the objects that we casted the pointer returned by
malloc to. Regardless, the compiler cannot assume that the code is UB
and e.g. remove it.

Post by Nicol Bolas
Personally? I say let it go. C++ programmers have managed to survive
this being UB since at least 2004. We're teaching C++ programmers
nowadays to avoid pointless casting; the average C++ programmer today is
far more likely to employ placement-new than to do casts and assume it
was constructed.

I disagree, because it requires programmers to write pointless code that
is known to be no-op anyway, just to satisfy the spec.

char* p = static_cast< char* >(malloc(sizeof(int) * 10));

// What is this code written for?
char* pi = p, *pe = p + sizeof(int) * 10;
for (; pi != pe; pi += sizeof(int))
{
    new (pi) int;
}

// Use the array of ints
int* q = reinterpret_cast< int* >(p);

Andrey Semashev

2017-01-17 10:50:52 UTC

Post by Andrey Semashev
Regarding the OP, I think it is fair to say that malloc returns a
storage, which is a sequence of bytes (chars) that is allowed to alias
any other type. In that sense, the compiler has no way to know what
actual objects are created by malloc in that storage, so when the user
cases the returned pointer,

so when the user casts...

Post by Andrey Semashev
type aliasing effectively happens. Whether
that is UB or not is a grey area because we don't know if the storage
actually contains objects of the type that we cast the pointer returned by
malloc to. Regardless, the compiler cannot assume that the code is UB
and e.g. remove it.

I disagree, because it requires programmers to write pointless code that
is known to be no-op anyway, just to satisfy the spec.
char* p = static_cast< char* >(malloc(sizeof(int) * 10));
// What is this code written for?
char* pi = p, *pe = p + sizeof(int) * 10;
for (; pi != pe; pi += sizeof(int))
{
    new (pi) int;
}
// Use the array of ints
int* q = reinterpret_cast< int* >(p);


Robert Haberlach

2017-01-17 11:00:29 UTC

Does it not? Could you provide a reference to the standard?

See [intro.object]/6 as modified by P0137.

Post by Andrey Semashev
char* p = static_cast< char* >(malloc(sizeof(int) + sizeof(float)));
int* pi = new (p) int;
float* pf = new (p + sizeof(int)) float;
(null checks and alignment accounting skipped for brevity).

Yes, this is well-defined AFAICS.
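
A sketch of the same snippet with the skipped null check and alignment accounting filled in might look like the following. `align_up`, `IntAndFloat` and `make_int_and_float` are hypothetical names introduced for illustration, and the code assumes alignments are powers of two:

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical result type: pointers to the two created objects plus the
// block to pass to std::free.
struct IntAndFloat { int* i; float* f; void* block; };

// Round `offset` up to the next multiple of `align` (a power of two).
constexpr std::size_t align_up(std::size_t offset, std::size_t align)
{
    return (offset + align - 1) & ~(align - 1);
}

inline IntAndFloat make_int_and_float()
{
    const std::size_t float_offset = align_up(sizeof(int), alignof(float));
    void* block = std::malloc(float_offset + sizeof(float));
    if (block == nullptr)
        return {nullptr, nullptr, nullptr};

    char* bytes = static_cast<char*>(block);
    // Each placement new creates an object; use the pointers it returns.
    int* i = ::new (static_cast<void*>(bytes)) int(0);
    float* f = ::new (static_cast<void*>(bytes + float_offset)) float(0.0f);
    return {i, f, block};
}
```

The objects are reached only through the pointers returned by placement new, never through a cast of `block`.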

Post by Andrey Semashev
Regarding the OP, I think it is fair to say that malloc returns a
storage, which is a sequence of bytes (chars) that is allowed to alias
any other type.

Storage cannot be aliased. Objects can.

Post by Andrey Semashev
In that sense, the compiler has no way to know what actual objects are
created by malloc in that storage,

There are no objects in that storage. There is a very specific list of
situations in which objects are created, and calls to malloc are
excluded (and that has always been so).

Post by Andrey Semashev
so when the user casts the returned pointer, type aliasing effectively
happens. Whether that is UB or not is a grey area because we don't
know if the storage actually contains objects of the type that we cast the
pointer returned by malloc to.

Yes, we do know that--see above. malloc and its semantics are known to
the compiler.

I disagree, because it requires programmers to write pointless code
that is known to be no-op anyway, just to satisfy the spec.

"Pointless" and "satisfy the spec" are contradictions. You have to start
viewing C++ as an abstract language with an abstract object model and
not some type of hacking playground, where as long as your pointers
contain the correct value and the memory is aligned, everything is well.

Post by Andrey Semashev
char* p = static_cast< char* >(malloc(sizeof(int) * 10));
// What is this code written for?
char* pi = p, *pe = p + sizeof(int) * 10;
for (; pi != pe; pi += sizeof(int))
{
    new (pi) int;
}


Andrey Semashev

2017-01-17 11:52:55 UTC

Does it not? Could you provide a reference to the standard?

See [intro.object]/6 as modified by P0137.

Yes, this is well-defined AFAICS.

Post by Andrey Semashev
Regarding the OP, I think it is fair to say that malloc returns a
storage, which is a sequence of bytes (chars) that is allowed to alias
any other type.

Storage cannot be aliased. Objects can.

Post by Andrey Semashev
In that sense, the compiler has no way to know what actual objects are
created by malloc in that storage,

There are no objects in that storage. There is a very specific list of
situations in which objects are created, and calls to malloc are
excluded (and that has always been so).

I don't find where it is excluded. malloc is an opaque function that
returns a void pointer. The compiler has no way to know what objects are
created in the storage accessible through that pointer.

Yes, we do know that--see above. malloc and its semantics are known to
the compiler.

malloc is imported from C in [c.malloc]. In C99, 7.20.3/1, there is this
passage:

The pointer returned if the allocation succeeds is suitably aligned
so that it may be assigned to a pointer to any type of object
and then used to access such an object or an array of such objects
in the space allocated (until the space is explicitly deallocated).
The lifetime of an allocated object extends from the allocation until
the deallocation.

So, according to C, the returned pointer may represent whatever object
the pointer is cast to.

Ok, you may argue that that description is given in terms of C, and that
doesn't mean that the same is valid with regard to C++ objects,
including trivial ones like int. Fair enough, but in that case the C++
standard should clarify that. And considering that there are allocation
functions other than malloc/calloc, the only sane behavior would be the
one compatible with C.

I disagree, because it requires programmers to write pointless code
that is known to be no-op anyway, just to satisfy the spec.

"Pointless" and "satisfy the spec" are contradictions.

No, unless you write code just to satisfy the spec. I personally don't
find that kind of activity productive.

Post by Robert Haberlach
You have to start
viewing C++ as an abstract language with an abstract object model and
not some type of hacking playground, where as long as your pointers
contain the correct value and the memory is aligned, everything is well.

Abstractions are a tool that the spec writers use to describe
generalized behavior of multiple conforming implementations. If that
tool requires people to write pointless code then the tool is broken.

Why didn't you just allocate using new[]?

Because I might have reasons to. E.g. to pass that pointer to a C
library later that I call from my otherwise C++ code. Or we can pretend
that that is not malloc but posix_memalign/aligned_alloc/whatever_alloc
that provides additional properties of the allocated memory, like
increased alignment, that cannot be provided by operator new.

Post by Robert Haberlach
Of course you have to explain
to the implementation that each raw memory location corresponds to an
int, and *you* imposed that burden on yourself--not the language.

The point I'm making is that the requirement to explain this is imposed
by the standard, while there is no technical reason to do that. That is
what I called "writing code just to satisfy the spec".


Nicol Bolas

2017-01-17 13:39:18 UTC

Post by Robert Haberlach
Of course you have to explain
to the implementation that each raw memory location corresponds to an
int, and *you* imposed that burden on yourself--not the language.

The point I'm making is that the requirement to explain this is imposed
by the standard, while there is no technical reason to do that.

If you know that there really is "no technical reason to do that", then you
ought to be able to propose changes to the specification that permit such a
thing *without* creating an object model that is inherently
self-contradictory or is otherwise broken.

So what *exactly* do you suggest we change? Not merely, "do stuff to make
this work". But what exact changes should be made to the specification?


Andrey Semashev

2017-01-17 15:09:17 UTC

Post by Robert Haberlach
Of course you have to explain
to the implementation that each raw memory location corresponds to an
int, and *you* imposed that burden on yourself--not the language.

Post by Andrey Semashev
The point I'm making is that the requirement to explain this is imposed
by the standard, while there is no technical reason to do that.

Post by Nicol Bolas
If you know that there really is "no technical reason to do that", then
you ought to be able to propose changes to the specification that permit
such a thing /without/ creating an object model that is inherently
self-contradictory or is otherwise broken.

So what /exactly/ do you suggest we change? Not merely, "do stuff to
make this work". But what exact changes should be made to the specification?

I don't have a concrete proposal. Producing one would require me to
spend significantly more time than I'm currently able to. But I don't
consider that a legitimate argument in favor of disregarding my or the
OP's point. You may disagree, of course.

My gut feeling is that the C++ object model has to allow a POD object to
automatically begin its lifetime whenever it is modified in the raw
storage. This is similar to how an active member of a union begins its
lifetime on the first modification.
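
The union behaviour referred to here can be sketched as follows, assuming the C++17 wording in which a built-in assignment to a union member starts that member's lifetime. `IntOrFloat` and `switch_active_member` are hypothetical names:

```cpp
// Hypothetical union with two trivial members.
union IntOrFloat {
    int i;
    float f;
};

inline float switch_active_member()
{
    IntOrFloat u;   // no active member yet
    u.i = 42;       // assignment makes `i` the active member
    u.f = 1.5f;     // assignment ends i's lifetime and starts f's
    return u.f;     // reading the currently active member is fine
}
```

No placement new appears anywhere; the assignment alone changes which member is alive, which is the behaviour being proposed for raw storage.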


Thiago Macieira

2017-01-17 18:12:59 UTC

On Tuesday, 17 January 2017, at 18:09:17 PST, Andrey Semashev wrote:

Post by Andrey Semashev
My gut feeling is that the C++ object model has to allow a POD object to
automatically begin its lifetime whenever it is modified in the raw
storage. This is similar to how an active member of a union begins its
lifetime on the first modification.

A POD object's lifetime should begin when storage for it is provided and end
when storage is freed. In that sense, it happens before the modification
through a pointer. In fact, you could say it happens sometime inside malloc().
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Nicol Bolas

2017-01-17 18:33:12 UTC

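Post by Thiago Macieira
A POD object's lifetime should begin when storage for it is provided and end
when storage is freed. In that sense, it happens before the modification
through a pointer. In fact, you could say it happens sometime inside malloc().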

*Which* POD object? There are an arbitrary number of them that can fit into
that storage. If you say that `malloc(sizeof(void*))` begins the lifetime
of every pointer type, every pointer-to-pointer type, every
pointer-to-pointer-to-pointer etc, then the entire idea of having a typed
object model loses all meaning.


Thiago Macieira

2017-01-17 20:24:24 UTC

Post by Thiago Macieira
A POD object's lifetime should begin when storage for it is provided and end
when storage is freed. In that sense, it happens before the modification
through a pointer. In fact, you could say it happens sometime inside malloc().

It's unspecified which one, but one only. That means the compiler cannot assume
that the code did not initialise, but it can infer from code that uses that
storage area what type it was.

struct S { int i; };
struct T { float f; };

auto ptr = malloc(4);

// the next line tells the compiler that there's an S object there
S *s = static_cast<S *>(ptr);
s->i = 0;

// the following line aliases S's storage with a struct that does not
// have a common sequence, so it's UB
T *t = static_cast<T *>(ptr);

// it would be as wrong as
T *t2 = reinterpret_cast<T *>(s);
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Nicol Bolas

2017-01-17 21:04:55 UTC

Post by Thiago Macieira
A POD object's lifetime should begin when storage for it is provided and end
when storage is freed. In that sense, it happens before the modification
through a pointer. In fact, you could say it happens sometime inside malloc().

Post by Nicol Bolas
*Which* POD object? There are an arbitrary number of them that can fit into
that storage. If you say that `malloc(sizeof(void*))` begins the lifetime
of every pointer type, every pointer-to-pointer type, every
pointer-to-pointer-to-pointer etc, then the entire idea of having a typed
object model loses all meaning.

Post by Thiago Macieira
It's unspecified which one, but one only. That means the compiler cannot assume
that the code did not initialise, but it can infer from code that uses that
storage area what type it was.

Standards do not work based off of vague inferences. They have to *specify*
behavior. So if an "inference" is going to be made, then there *must* be an
explicit enumeration of syntactic constructs which the compiler will use to
"infer" the type.

At which point, those syntaxes are not being used to "infer" something;
they now become alternate syntaxes for creating an object.

If you say that `static_cast` can begin the lifetime of an object, then you
need to say under which circumstances that will happen. Does it only work
from memory fresh out of `malloc`, or can it work on `malloc`ed memory that
used to have an object in it and you're now replacing it with another? How
do you tell the difference between pointer conversion and object
initialization?

C++ has a way to tell the difference: `static_cast` is for pointer
conversion; `new()` is for object initialization. Because of that, it can
do this:

struct S { int i; };
struct T { float f; };

auto mem = malloc(std::max(sizeof(S), sizeof(T)));

S *s = new(mem) S;
s->i;

T *t = static_cast<T*>(mem); //OK, but you can't use `t`.

T *t2 = new(mem) T; //Can use `t2`, but not `s` anymore.
t = std::launder(t); //I can use `t` now.
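
A compilable sketch of the storage-reuse part of this example, leaving out `std::launder` and simply using the pointer that placement new returns. `reuse_storage` is a hypothetical name, and the values stored are arbitrary:

```cpp
#include <algorithm>
#include <cstdlib>
#include <new>

struct S { int i; };
struct T { float f; };

// Hypothetical function: create an S in malloc'ed storage, then reuse the
// same storage for a T.
inline float reuse_storage()
{
    void* mem = std::malloc(std::max(sizeof(S), sizeof(T)));
    if (mem == nullptr)
        return 0.0f;

    S* s = ::new (mem) S;    // creates an S; `s` is usable
    s->i = 7;

    T* t2 = ::new (mem) T;   // reuses the storage; `s` must not be used now
    t2->f = 2.5f;

    const float result = t2->f;
    std::free(mem);
    return result;
}
```

Every access goes through a pointer produced by the expression that created the object, so the question of what a bare cast means never comes up.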


Thiago Macieira

2017-01-18 05:08:39 UTC

Post by Nicol Bolas
Standards do not work based off of vague inferences. They have to *specify*
behavior. So if an "inference" is going to be made, then there *must* be an
explicit enumeration of syntactic constructs which the compiler will use to
"infer" the type.

It wasn't a vague inference. It was a logical conclusion based on existing
rules.

If the compiler sees you casting a memory block to a given class, unless it
has reason to doubt you, it should trust you that you're right and that
pointer points to an area containing an object of that type. That is, if your
code is:

extern "C" void *allocate();
void *ptr = allocate();
S *s = static_cast<S *>(ptr);
s->i = 0;

why should it doubt you? What's to say that the allocate function isn't:

void *allocate() { return new S; }

To be explicit: unless the compiler can prove that the allocation function
*isn't* allocating an S, it has no reason to doubt your casting.

Post by Nicol Bolas
At which point, those syntaxes are not being used to "infer" something;
they now become alternate syntaxes for creating an object.

As I said in another email, static_cast does not create the object, and
neither does dereferencing that pointer. The creation of the POD object
happened in the allocation of the storage, since the constructor is trivial.

Post by Nicol Bolas
If you say that `static_cast` can begin the lifetime of an object, then you
need to say under which circumstances that will happen. Does it only work
from memory fresh out of `malloc`, or can it work on `malloc`ed memory that
used to have an object in it and you're now replacing it with another? How
do you tell the difference between pointer conversion and object
initialization?

See above. Initialisation happens inside the allocation function, not on
casting.

Now, the lifetime can end if you repurpose the storage by memcpy'ing something
else there. See my other email where I said memcpy can be the same as:

x2.~X();
new (&x2) X(x1);
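
The one use of memcpy that the standard does spell out is copying the bytes of one existing object of a trivially copyable type into another ([basic.types]). Whether memcpy into raw storage can also start a lifetime is the contested part. A minimal sketch of the uncontested case, with `X` and `copy_with_memcpy` as hypothetical names:

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical trivially copyable type.
struct X { int a; float b; };

static_assert(std::is_trivially_copyable<X>::value,
              "memcpy-based copying needs a trivially copyable type");

inline X copy_with_memcpy(const X& x1)
{
    X x2{};                             // x2 is created by its definition
    std::memcpy(&x2, &x1, sizeof(X));   // x2 now holds the same value as x1
    return x2;
}
```

Here both source and destination are objects that already exist, so no lifetime question arises.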

Post by Nicol Bolas
C++ has a way to tell the difference: `static_cast` is for pointer
conversion; `new()` is for object initialization. Because of that, it can
do this:

struct S { int i; };
struct T { float f; };
auto mem = malloc(std::max(sizeof(S), sizeof(T)));
S *s = new(mem) S;
s->i;
T *t = static_cast<T*>(mem); //OK, but you can't use `t`.

Actually, I've seen UBSan complain about a static cast of the wrong type, so
you shouldn't cast to the wrong type, even if you don't use the pointer.
Though in that case we were talking about polymorphic types and here we're
talking about trivial ones.

Post by Nicol Bolas
T *t2 = new(mem) T; //Can use `t2`, but not `s` anymore.
t = std::launder(t); //I can use `t` now.

Agreed, your code is fine. And using the placement new allows us to be explicit
about the object initialisation and also safe if the type in question isn't
trivially constructible.

But the compiler cannot prove that malloc didn't initialise the object before
it returned. Take the allocate() function from above: if we expand the
operator new, we get:

void *allocate()
{
    auto ptr = ::operator new(sizeof(S));
    new (ptr) S;
    return ptr;
}

But since S has a trivial constructor, the placement new must expand to
absolutely nothing and have no side effects. Therefore, that function is
functionally identical to:

void *allocate() { return ::operator new(sizeof(S)); }

Finally, since the default ::operator new function just calls malloc, it's no
different from:

void *allocate() { return malloc(sizeof(S)); }

To me, this proves that you cannot distinguish malloc() or any other memory
allocation function from a function that initialises a trivial object.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Nicol Bolas

2017-01-18 16:08:25 UTC

Post by Nicol Bolas
Standards do not work based off of vague inferences. They have to *specify*
behavior. So if an "inference" is going to be made, then there *must* be an
explicit enumeration of syntactic constructs which the compiler will use to
"infer" the type.

There's no question of doubting or not doubting. The question is "does
static_cast create an object?" Because that's what you seem to want.

If that's what you want, then *say that*. That's what I'm talking about
with "inferring;" it's nonsense. Either performing an operation creates a
*specific* object or it doesn't. If it does, then we need to know exactly
which operations should be creating exactly which objects and under exactly
which circumstances.

Without that, you can't have a specification.

Post by Thiago Macieira
void *allocate() { return new S; }

To be explicit: unless the compiler can prove that the allocation function
*isn't* allocating an S, it has no reason to doubt your casting.

Again, there's this pointless question of doubting or not doubting. C++ is
very clear here: if your `allocate` function did indeed return an `S`
object, then your code works. If it did not return an `S` object, then you
get UB.

Doubt is not in question; the compiler is doing what it was told: if there
was no object there, then you get UB.

The problem being discussed in this thread is when *there is no object
there*. When there is merely a collection of bits that might form that
object, but no object actually exists there yet. No object has been started
in accord with the rules of [intro.object]/1.

Pretending that there is an object somewhere when there isn't is UB.

The only way to make that code legal, while still having a memory model
that is coherent, is to have one of the operations in that code actually
create the object.

So which operation should it be?

Post by Nicol Bolas
At which point, those syntaxes are not being used to "infer" something;
they now become alternate syntaxes for creating an object.

Which POD? If I call `malloc`, which POD did it create? It cannot
simultaneously create all PODs. So which one was it?

Because if you cannot answer that question, then you have a dysfunctional
memory model. We cannot have a Schrodinger's Cat memory model, where the
memory contains some quantum state object that is all POD types
simultaneously until you first look at it.

Post by Nicol Bolas
If you say that `static_cast` can begin the lifetime of an object, then you
need to say under which circumstances that will happen. Does it only work
from memory fresh out of `malloc`, or can it work on `malloc`ed memory that
used to have an object in it and you're now replacing it with another? How
do you tell the difference between pointer conversion and object
initialization?

Post by Thiago Macieira
See above. Initialisation happens inside the allocation function, not on
casting.

Initialization is irrelevant. Objects can be created without initializing
them; trivial default constructors do it all the time.

What matters is when an object is *created*.

Post by Thiago Macieira
Now, the lifetime can end if you repurpose the storage by memcpy'ing something
else there. See my other email where I said memcpy can be the same as:

x2.~X();
new (&x2) X(x1);

Post by Nicol Bolas
C++ has a way to tell the difference: `static_cast` is for pointer
conversion; `new()` is for object initialization. Because of that, it can
do this:

struct S { int i; };
struct T { float f; };
auto mem = malloc(std::max(sizeof(S), sizeof(T)));
S *s = new(mem) S;
s->i;
T *t = static_cast<T*>(mem); //OK, but you can't use `t`.

Post by Thiago Macieira
Actually, I've seen UBSan complain about a static cast of the wrong type, so
you shouldn't cast to the wrong type, even if you don't use the pointer.

But the cast itself doesn't provoke UB; [expr.static.cast]/13 says so. It's
accessing data through that pointer that provokes UB. So such UBSan reports
are wrong.

Post by Thiago Macieira
Though in that case we were talking about polymorphic types and here we're
talking about trivial ones.

Post by Nicol Bolas
T *t2 = new(mem) T; //Can use `t2`, but not `s` anymore.
t = std::launder(t); //I can use `t` now.

Post by Thiago Macieira
But the compiler cannot prove that malloc didn't initialise the object before
it returned.

The compiler most certainly can prove that malloc did not create an object
of type `S`. Because `std::malloc` is not specified to create an object of *any
type*; it merely allocates memory. As such, the compiler is free to assume
that the memory `std::malloc` returns has no objects in it.

Post by Thiago Macieira
Take the allocate() function from above: if we expand the
operator new, we get:

void *allocate()
{
    auto ptr = ::operator new(sizeof(S));
    new (ptr) S;
    return ptr;
}

But since S has a trivial constructor, the placement new must expand to
absolutely nothing and have no side effects. Therefore, that function is
functionally identical to:

void *allocate() { return ::operator new(sizeof(S)); }

Please read [intro.object]/1. The two pieces of code are not identical. No
matter the fact that `new(ptr) S` will be a no-op for real-life compilers,
it still *does something*. It creates the object. Without that line, you
have storage that contains no objects.

Post by Thiago Macieira
Finally, since the default ::operator new function just calls malloc, it's no
different from:

void *allocate() { return malloc(sizeof(S)); }

To me, this proves that you cannot distinguish malloc() or any other memory
allocation function from a function that initialises a trivial object.

It proves no such thing. Given sufficient power and inlining, the compiler
can recognize that you're calling `std::malloc`, which as defined by the
standard does not create an object. The compiler can see that `allocate`
also does not create a C++ object. Therefore, the compiler has all the
information to know that, if you don't create an object in that memory
yourself, then trying to access it will invoke UB. And therefore, the
compiler is technically free to wipe out any of those operations that don't
have side-effects. For example:

auto ptr = allocate();
auto s_ptr = static_cast<S*>(ptr);
s_ptr->i = 5;

Memory allocation has side-effects, so they can't be wiped out. But the
third line could be freely ignored by a really smart compiler, because it
knows that there is no `S` object there.
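
For contrast, a sketch of the variant that nobody in this thread disputes: the same three steps, but with the cast replaced by a placement new so that an `S` object demonstrably exists before the store. `store_five` is a hypothetical name, and `allocate` is assumed to be the plain malloc wrapper discussed above:

```cpp
#include <cstdlib>
#include <new>

struct S { int i; };

// Assumed to be the malloc-only version of allocate() from the discussion.
inline void* allocate() { return std::malloc(sizeof(S)); }

// Hypothetical function: create the S explicitly, then store through it.
inline int store_five()
{
    void* ptr = allocate();
    if (ptr == nullptr)
        return -1;

    S* s_ptr = ::new (ptr) S;  // creates the S object; typically no code emitted
    s_ptr->i = 5;              // store into an object that exists

    const int value = s_ptr->i;
    std::free(ptr);
    return value;
}
```

With the object created explicitly, the store can no longer be treated as an access to a nonexistent `S`.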

Now granted, I have no idea why someone would write a compiler that does
that. If `allocate` did indeed create an `S` object, then the compiler
would compile the code as normal. And if `allocate` doesn't create an `S`
object... what would be the point of removing the third line? The user may
have provoked UB, but you're not saving any performance in a well-defined
program.

But the fear some people have on this thread is that the C++ memory model
permits the compiler to throw out such code. And therefore, should be
"fixed"... somehow.


Thiago Macieira

2017-01-18 17:03:00 UTC

Post by Nicol Bolas
There's no question of doubting or not doubting. The question is "does
static_cast create an object?" Because that's what you seem to want.

It does not. I thought I was very explicit. The object was created before the
static_cast.

Post by Thiago Macieira
void *allocate() { return new S; }
To be explicit: unless the compiler can prove that the allocation function
*isn't* allocating an S, it has no reason to doubt your casting.

Agreed. There's no doubting that.

What I am saying is that for an opaque function returning a void*, the compiler
cannot infer that it did not create an S. Therefore, it must allow for
the possibility that it did and cannot optimise otherwise.

Post by Nicol Bolas
The problem being discussed in this thread is when *there is no object
there*. When there is merely a collection of bits that might form that
object, but no object actually exists there yet. No object has been started
in accord with the rules of [intro.object]/1.
Pretending that there is an object somewhere when there isn't is UB.

I disagree. POD & trivial object lifetimes begin when storage for them is
allocated. Therefore, a POD object could exist there with just a collection of
unspecified bits.

Post by Thiago Macieira
void *allocate() { return ::operator new(sizeof(S)); }

How can an operation that does nothing do something?

The ultimate allocation function is always opaque and non-inline: if it's not
allocate() above, it's ::operator new(). If it's not ::operator new(), then
it's malloc(). If it's not malloc(), then it's a system call like VirtualAlloc
or brk() or mmap(). So even if the compiler can inline almost everything, it
stops at some opaque boundary that returns a pointer to a bag of bits. That
being the case, the compiler does not know what the allocation function did or
did not do. Therefore, it must account for the possibility that the function
did placement-new S there.

And even if the deep embedded system has no system calls and simply finds
memory from a list somewhere, it usually cannot "remember" what happened in
that memory allocation, so it cannot rule out that the new S did happen some
time in the past.

The way I see it, this is reality. Even if we made the standard language
stupid and tried to diverge from reality, it wouldn't change the behaviour,
because the compiler could never prove UB and therefore could never optimise
your code to be different from what it is today.

Post by Nicol Bolas
It proves no such thing. Given sufficient power and inlining, the compiler
can recognize that you're calling `std::malloc`, which as defined by the
standard does not create an object. The compiler can see that `allocate`
also does not create a C++ object. Therefore, the compiler has all the
information to know that, if you don't create an object in that memory
yourself, then trying to access it will invoke UB. And therefore, the
compiler is technically free to wipe out any of those operations that don't
have side-effects.

We should change the language of the standard to say that std::malloc could
have created a trivially-constructible object there.

Post by Nicol Bolas
auto ptr = allocate();
auto s_ptr = static_cast<S*>(ptr);
s_ptr->i = 5;
Memory allocation has side-effects, so they can't be wiped out. But the
third line could be freely ignored by a really smart compiler, because it
knows that there is no `S` object there.

Agreed, it could, but only if the compiler can prove that there is no S there.
What I am proposing is that we tell the compiler it doesn't know what's there.
Therefore, an S *could* be there.

Post by Nicol Bolas
But the fear some people have on this thread is that the C++ memory model
permits the compiler to throw out such code. And therefore, should be
"fixed"... somehow.

Yes.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Robert Haberlach

2017-01-17 13:52:40 UTC

Post by Robert Haberlach
There is a very specific list of
situations in which objects are created, and calls to malloc are
excluded (and that has always been so).

I don't find where it is excluded. malloc is an opaque function that
returns a void pointer. The compiler has no way to know what objects
are created in the storage accessible through that pointer.

It isn't explicitly excluded; rather, malloc is specified as allocating
storage, and the list does not mention malloc as creating an object
(e.g. a char array).

Post by Andrey Semashev
Ok, you may argue that that description is given in terms of C, and
that doesn't mean that the same is valid with regard to C++ objects,
including trivial ones like int. Fair enough, but in that case the C++
standard should clarify that.

It doesn't need to; the status quo (as mentioned in P0137 as a drafting
note) is that malloc does not create objects. Period. You are needlessly
sceptical about things that are clear to everyone else.

Post by Andrey Semashev
And considering that there are allocation functions other than
malloc/calloc, the only sane behavior would be the one compatible with C.

I disagree.

I disagree, because it requires programmers to write pointless code
that is known to be no-op anyway, just to satisfy the spec.

"Pointless" and "satisfy the spec" are contradictions.

No, unless you write code just to satisfy the spec. I personally don't
find that kind of activity productive.

We have a "spec" (international standard) for C++, and you can either
write code conforming to it or not. But don't complain if non-conforming
code doesn't execute as you intended. The spec did not say "repeat each
variable's declaration for no reason". It says "if you want to use an
uninitialized memory location as an object of some type T, you must
express this by employing placement new on it". Just because it's a
no-op it's not pointless.
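
Applied to the aligned char array from the original question, that rule could look like the following minimal sketch. The function name and the buffer are assumptions made for illustration, not code from the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Uses an aligned byte buffer as raw storage for exactly one int.
int store_int_in_buffer(int value) {
    alignas(int) unsigned char buffer[sizeof(int)];
    // Sanity check that the buffer really is suitably aligned for int.
    assert(reinterpret_cast<std::uintptr_t>(buffer) % alignof(int) == 0);
    int* p = new (buffer) int(value);  // creates the int object inside the buffer
    return *p;                         // reads through the pointer placement new returned
}
```

Here `store_int_in_buffer(42)` returns 42. The read goes through the pointer returned by placement new rather than through a cast of `buffer`, which sidesteps the question this thread is arguing about.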

Post by Robert Haberlach
Of course you have to explain
to the implementation that each raw memory location corresponds to an
int, and *you* imposed that burden on yourself--not the language.

The point I'm making is that the requirement to explain this is
imposed by the standard, while there is no technical reason to do
that. That is what I called "writing code just to satisfy the spec".

If you omit it, the optimizer will screw with your code. But I guess
there's little point arguing with some kind of C fundamentalist.


Richard Hodges

2017-01-17 12:37:52 UTC

It seems to me that this entire discussion can be paraphrased as, "is c++
compatible with c or not?". [compatibility being defined as PODs legally
created by one are valid in the other].

The answer appears, at least insofar as the standard is concerned, "no, or
at best, implementation defined".

I think it would be fair to say that the vast majority of c++'s user base
would hope that the answer is, "yes".

I think it's therefore reasonable that the standard should give PODs
special treatment, allowing them to have been created (default-initialised)
simply by allocating properly aligned storage of sufficient size.

This would then allow c++ to interoperate with c (in both directions) both
legally and *de-facto.*

Since the standard does seem to acknowledge the C language, is there any
reasonable argument that suggests that this explicit interoperability
should not be in the standard?

R

Does it not? Could you provide a reference to the standard?

See [intro.object]/6 as modified by P0137.

Yes, this is well-defined AFAICS.

Post by Andrey Semashev
Regarding the OP, I think it is fair to say that malloc returns a
storage, which is a sequence of bytes (chars) that is allowed to alias
any other type.

Storage cannot be aliased. Objects can.

Post by Andrey Semashev
In that sense, the compiler has no way to know what actual objects are
created by malloc in that storage,

There are no objects in that storage. There is a very specific list of
situations in which objects are created, and calls to malloc are
excluded (and that has always been so).

Yes, we do know that--see above. malloc and its semantics are known to
the compiler.

I disagree, because it requires programmers to write pointless code
that is known to be no-op anyway, just to satisfy the spec.

Why didn't you just allocate using new[]? Of course you have to explain
to the implementation that each raw memory location corresponds to an
int, and *you* imposed that burden on yourself--not the language. If
you're so into malloc, try C.

Nicol Bolas

2017-01-17 13:45:37 UTC

Post by Richard Hodges
It seems to me that this entire discussion can be paraphrased as, "is c++
compatible with c or not?". [compatibility being defined as PODs legally
created by one are valid in the other].
The answer appears, at least insofar as the standard is concerned, "no, or
at best, implementation defined".
I think it would be fair to say that the vast majority of c++'s user base
would hope that the answer is, "yes".
I think it's therefore reasonable that the standard should give PODs
special treatment, allowing them to have been created (default-initialised)
simply by allocating properly aligned storage of sufficient size.
This would then allow c++ to interoperate with c (in both directions) both
legally and *de-facto.*


Richard Hodges

2017-01-17 14:09:00 UTC

Post by Nicol Bolas
It would also create a dysfunctional C++ object model. A piece of storage
would have to have *every* object that could fit into it all at the same
time. You're basically saying that any piece of memory should be able to
be treated as a union of all appropriate types.

Post by Nicol Bolas
As I understand it, that's not even the *C* object model; it manifests
objects in arbitrary memory by you writing to them.

OK, I think it's reasonable to require an intrinsic to have been written to
before it is an 'object'. That makes sense as it allows the use of
sentinels etc. Presumably this is why C mandates it this way.

So I'll modify my argument while continuing on the theme: What's sauce for
the goose ought to be sauce for the supposedly compatible gander.

I argue (and I don't think I am alone) that for intrinsics and PODs solely
thereof, elements that have been written to, either by C or C++ ought to
have been constructed. If they have been written to through a cast of a
correctly aligned memory pointer, they ought to 'exist' *in that actual
memory* [subject to as-if-compatible optimisations, of course]. This is
what we expect in C, and it is arguably what it would be reasonable to
expect in C++.

I accept that this would require a differentiation in handling between PODs
and non-POD structs in the standard. I think that's reasonable:

When constructors, destructors, copy, move ops are non-trivial, we expect
'object-like' behaviour. When they are trivial (particularly when they are
NOPs) we expect memory-like behaviour.

Again, this is the de-facto reality on which the code base of every c++
program that calls a C library depends. Why not codify a de-facto reality
in order to legitimise it?


Nicol Bolas

2017-01-17 15:53:00 UTC

Post by Richard Hodges
Again, this is the de-facto reality on which the code base of every c++
program that calls a C library depends. Why not codify a de-facto reality
in order to legitimise it?

Because making something de-jure requires actually pinning down what we're
talking about, rather than making broad generalizations about the way we
think things ought to work. It's easy to say what you're saying, but to
actually make it work without the object model becoming contradictory is a
huge process.

And even then, it wouldn't get you all of your C-isms.

For example, let's say you malloc some memory and pass it along to a C API
that is going to fill that memory with consecutive `int`s. OK, fine; that
memory now has a bunch of `int`s in it.

Know what it *doesn't* have? An *array*:

auto alloc = malloc(sizeof(int) * n);
get_ints(alloc, n, ...);
auto ints = reinterpret_cast<int*>(alloc);
ints[5]; //UB

This is at the root of why `vector` has to rely on UB to get things done:
the standard makes a distinction between objects that happen to be
sequential in a piece of storage and an *array* of objects. Pointer
arithmetic, the basis of [], only works on arrays; it is explicitly not
allowed to jump from one object to another object unless they are in the
same array.

Go ahead; take a look at [expr.add]. It only works for arrays. And the C
API did not create an array.
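
One way to sidestep the problem, sketched under the assumption that the C API only needs caller-provided storage: let C++ create the array with new[] (as suggested earlier in the thread) and hand its address to the C function. `fill_ints` is a hypothetical stand-in for `get_ints`, and `read_ints` is likewise made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for the C API: writes n consecutive ints into dst.
extern "C" void fill_ints(void* dst, std::size_t n) {
    assert(dst != nullptr || n == 0);
    int* out = static_cast<int*>(dst);
    for (std::size_t k = 0; k < n; ++k) {
        out[k] = static_cast<int>(k) * 10;
    }
}

// The storage comes from new int[n], so a real array object exists
// before the C function ever writes to it.
std::unique_ptr<int[]> read_ints(std::size_t n) {
    std::unique_ptr<int[]> ints = std::make_unique<int[]>(n);
    fill_ints(ints.get(), n);
    return ints;
}
```

With this arrangement, indexing the result of `read_ints(8)` with `[5]` is pointer arithmetic inside an array created by a new-expression, so the [expr.add] objection does not arise. The cost is that the allocation has to happen on the C++ side.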

Then there's the question of whether you're using the same type as the
function filling in the object. A C API may have a `typedef struct` that
contains an `int` and a `float`. If you declare a C++ struct that is layout
compatible with it, then you can memcpy this data into your C++ struct
equivalent (if it is also trivially copyable). But unless you are using the
same type definition as the C function that generated it, the objects that
C created were *their* struct. And therefore, it is UB to simply cast that
pointer to your C++ struct and start accessing it.
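
The memcpy route described above could be sketched as follows. Both struct definitions and the helper `adopt` are hypothetical; the static_asserts only check the properties the paragraph relies on and do not prove layout compatibility in the standard's sense.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical struct as a C library header might declare it.
extern "C" {
typedef struct c_record {
    int id;
    float weight;
} c_record;
}

// Separately declared C++ struct with the same member sequence.
struct Record {
    int id;
    float weight;
};

static_assert(std::is_standard_layout<Record>::value, "Record must be standard-layout");
static_assert(std::is_trivially_copyable<Record>::value, "Record must be trivially copyable");
static_assert(sizeof(Record) == sizeof(c_record), "sizes must match for a byte copy");

// Copies the bytes into a genuine Record instead of casting the pointer.
Record adopt(const void* c_data) {
    assert(c_data != nullptr);
    Record r;
    std::memcpy(&r, c_data, sizeof r);
    return r;
}
```

The returned `Record` is an ordinary C++ object, so accessing its members never touches the storage the C side wrote through a differently named type. Optimisers commonly turn such a small fixed-size memcpy into plain loads and stores, though that is a quality-of-implementation matter.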

So now what? Do we say that if two standard layout types are layout
compatible, it's OK to just pretend that one type is another type? Because
that throws strict aliasing right out the window.

Then there's the question of why you need to restrict this to POD types at
all. If what causes an object to come into being is copying into storage,
would it not make sense to broaden the limitation based on that? That is,
allow any type for which bitwise copies make sense? I.e., trivially copyable.
This is after all why C++11 created the standard-layout/trivially-copyable
distinction: because POD is too narrow of a limitation based on what you're
actually doing.
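
To make the distinction concrete, here is a small sketch using the standard type traits. The three types are hypothetical and exist only to exercise the traits; everything is checked at compile time.

```cpp
#include <type_traits>

// Trivial and standard-layout: what older standards called a POD.
struct Plain {
    int a;
    float b;
};

// A user-provided default constructor makes the type non-trivial,
// but copying it is still a plain byte copy.
struct WithCtor {
    int a;
    WithCtor() : a(0) {}
};

// Mixed access control breaks standard layout, yet the type stays
// trivially copyable.
class MixedAccess {
public:
    int a;
    int b_value() const { return b; }
private:
    int b;
};

static_assert(std::is_trivial<Plain>::value && std::is_standard_layout<Plain>::value,
              "Plain is a POD in the old sense");
static_assert(!std::is_trivial<WithCtor>::value, "WithCtor is not trivial");
static_assert(std::is_trivially_copyable<WithCtor>::value, "but it is trivially copyable");
static_assert(!std::is_standard_layout<MixedAccess>::value, "MixedAccess is not standard-layout");
static_assert(std::is_trivially_copyable<MixedAccess>::value, "but it is trivially copyable");
```

A rule phrased in terms of POD would exclude `WithCtor` and `MixedAccess`, even though a bitwise copy of either is well-defined.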

Now, some of these issues are things we need to fix (the specification of
pointer arithmetic not allowing you to access sequential objects of the
same type as though they were an array, for example). But overall, making
C-isms legal C++ code is a lot of standardization work for little real gain.

Let C-isms remain UB. Encourage C++ programmers to follow C++ practices.
And move forward.

The most I would be willing to see is a statement that code external to the
C++ program is permitted to create objects compatible with C++'s object
model, on an implementation-defined basis. But we should not standardize
C-isms within actual C++ code.


Robert Haberlach

2017-01-17 16:25:46 UTC

Post by Nicol Bolas
The most I would be willing to see is a statement that code external
to the C++ program is permitted to create objects compatible with
C++'s object model, on an implementation-defined basis.

Cf. footnote 40:

"40) This section does not impose restrictions on indirection through
pointers to memory not allocated by ::operator new. This maintains the
ability of many C++ implementations to use binary libraries and
components written in other languages. In particular, this applies to C
binaries, because indirection through pointers to memory allocated by
std::malloc is not restricted."


Richard Hodges

2017-01-17 16:32:52 UTC

I cannot see any reasonable argument that pointer arithmetic should not be
allowed to work on consecutive objects. Given int* p = get_ints_from_c();
the expression *(++p) *should absolutely* be defined behaviour in c++,
provided there is actually an int's worth of memory at p + 1 (that is,
sizeof(int) bytes past std::addressof(*p)); there is no conceivable reason
why it should not.

Note that I am asserting *should absolutely* - a very strong statement.
This is because we absolutely cannot move away from c. There are no c++
operating systems. Therefore all useful libraries are written with C
interfaces. Thousands of c++ wrapper libraries exist to turn those C
interfaces back into c++. We don't do that because we want to. We do that
because C++ is not suitable for creating portable object libraries, having no
modules or common ABI.

By all means let's talk about moving forward - after we have modules,
defined ABIs, an agreed-upon means of transmitting exceptions and so on.

Until then, the entire foundation of our C++ universe is C. To try to
pretend otherwise is a fallacy.

OK, it's "difficult" to marry the c++ abstract machine model with the C
memory model. So what? That doesn't mean that it should not be done.
Clearly the definition of the C++ abstract machine needs to be revised, or
made more granular. Difficulty does not come into it.

Should bitwise copies of non-trivial objects be allowed? Of course not. We
can express the reason why not as a high level "because of memory model
concerns" or we can be truthful: "because the pointers will be wrong and
your double delete will crash the program". C++ is a low level high level
language. We should not be afraid of talking about memory, addresses and
pointers. That is what they are.
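
The "of course not" can be enforced mechanically. The following is a minimal sketch of a guarded byte copy; `bitwise_copy` and `Point` are hypothetical names, and std::string merely serves as a familiar example of a type that owns memory.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <type_traits>

// Copies the object representation, but only compiles for types where
// that is a meaningful way to copy a value.
template <typename T>
void bitwise_copy(T& dst, const T& src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "bitwise copy requires a trivially copyable type");
    assert(&dst != &src);  // memcpy requires non-overlapping ranges
    std::memcpy(&dst, &src, sizeof(T));
}

struct Point {
    int x;
    int y;
};

// An owning type such as std::string is rejected at compile time:
// duplicating its bytes would leave two objects freeing one buffer.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string must not be copied bytewise");
```

Copying a `Point` with `bitwise_copy` compiles and behaves like assignment, while the same call on two std::string objects fails to compile instead of crashing later with a double delete.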

As much as the standards committee wants to pretend that C++ is not "C with
classes, templates and exceptions", they are wrong. It is, and it will
always be. If it wasn't, it wouldn't work because rather than relying on a
million open-source contributors to C libraries in order to create anything
but the most trivial and useless program, there would have to be a central
team writing all the c++ libraries for sound, graphics, crypto, comms, etc,
etc, etc.

If that were the case then we may as well dump all of the C compatibility
and go for D.

C++ is an evolution of C, it relies on C in its standard libraries. Its
user community relies on its compatibility with C. Its memory should be
100% compatible. Optimisers can cope.


Demi Obenour

2017-01-17 18:20:10 UTC

THIS. C++'s main selling point is C compatibility. Much code is REQUIRED
to compile as BOTH C and C++. Without changes.

No compiler will dare treat this as undefined behavior unless it is also
undefined in C.


Thiago Macieira

2017-01-17 18:31:45 UTC

Post by Demi Obenour
THIS. C++'s main selling point is C compatibility. Much code is REQUIRED
to compile as BOTH C and C++. Without changes.

Yes, but no.

Yes, I agree with your agreeing with Richard.

But no, there's not a lot of code that needs to compile as both C and C++.
That's limited to a few (static) inline functions in headers. It may be that
they're used extremely often, especially if they come from the standard C
library itself or from POSIX or a relevant standard like the ancillary socket
data payloads defined by RFC 3542 -- CMSG_DATA and CMSG_NXTHDR are *ugly*.

Richard's point, which I agree with, is that C++ needs to interoperate with C
libraries and vice-versa. That does not imply compiling a lot of code as
either.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Nicol Bolas

2017-01-17 18:44:13 UTC

On Tuesday, 17 January 2017, at 13:20:10 PST, Demi Obenour wrote:

Post by Demi Obenour
THIS. C++'s main selling point is C compatibility. Much code is REQUIRED
to compile as BOTH C and C++. Without changes.

Yes, but no.
Yes, I agree with your agreeing with Richard.
But no, there's not a lot of code that needs to compile as both C and C++.
That's limited to a few (static) inline functions in headers. It may be that
they're used extremely often, especially if they come from the standard C
library itself or from POSIX or a relevant standard like the ancillary socket
data payloads defined by RFC 3542 -- CMSG_DATA and CMSG_NXTHDR are *ugly*.
Richard's point, which I agree with, is that C++ needs to interoperate with C
libraries and vice-versa. That does not imply compiling a lot of code as
either.

That I can agree with. The standard needs a clear statement on the ability
of non-C++ code to be able to create C++ objects.

Validating C-isms inside C++ code is not something we need to deal with.

From an interop perspective, the main problem there is dealing with
compatibility between structs. C can write data that is layout compatible
with C++, but unless they are both working on the same struct definition,
C++'s object model does not allow accessing an object that's merely layout
compatible with what's there.

So, to avoid having to memcpy from external objects into C++ objects, you
would need to have a way to effectively adopt that memory into a formal C++
object. I'm not sure what that ought to look like, but it would be a new
way of creating a C++ object.

I hesitate to call it "type punning," but that is essentially what it is.
Except that it is formalized within the object model and is only valid for
trivial types (if you have a non-trivial type that's trivially copyable,
you'll have to copy it to get this to work). And, like any form of
lifetime-ending-followed-by-restart, pointers/references/variables to the
old memory aren't valid for accessing the object's values anymore.
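
For comparison, here is a minimal sketch of the memcpy route that such an "adopt this memory" feature would let you avoid. The `Sample` struct and the two helper functions are hypothetical names made up for this illustration; the only assumption is that the external producer writes bytes matching the struct's layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical record that an external (e.g. C) producer writes as raw bytes.
struct Sample {
    int id;
    double value;
};

static_assert(std::is_trivially_copyable<Sample>::value,
              "memcpy into an object is only valid for trivially copyable types");

// Copy the bytes into a real Sample object instead of casting the buffer.
Sample load_sample(const unsigned char* bytes, std::size_t size) {
    assert(size >= sizeof(Sample));
    Sample out;
    std::memcpy(&out, bytes, sizeof(Sample));
    return out;
}

// Demonstration helper: write a Sample's bytes into a raw buffer.
void store_sample(const Sample& in, unsigned char* bytes) {
    std::memcpy(bytes, &in, sizeof(Sample));
}
```

The copy is what makes this well-defined today; the cost of that copy is exactly what a zero-copy "adoption" mechanism would remove.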


Richard Hodges

2017-01-17 18:56:59 UTC

There is already a mechanism in the language syntax for interoperability
with C.

extern "C" { // <-- this!

struct Bar { double d; };
// this is clearly an object that C can initialise
struct Foo { int bar_count; Bar bars[]; };
Foo* makeFoo(); // make a Foo with a number of Bars. N will be 1+
void destroyFoo(Foo*);
}

// this should be predictable and correct in the standard
auto pfoo = std::unique_ptr<Foo, void(*)(Foo*)>(makeFoo(), &destroyFoo);
auto first = pfoo->bars;
auto last = first + pfoo->bar_count;
doBarThings(first, last);

As the standard is currently worded, this code is UB. That's unacceptable.


Andrey Semashev

2017-01-17 19:04:54 UTC

Post by Richard Hodges
There is already a mechanism in the language syntax for interoperability
with C.
extern "C" { // <-- this!
struct Bar { double d; };
// this is clearly an object that C can initialise
struct Foo { int bar_count; Bar bars[]; };
Foo* makeFoo(); // make a Foo with a number of Bars. N will be 1+
void destroyFoo(Foo*);
}
// this should be predictable and correct in the standard
auto pfoo = std::unique_ptr<Foo, void(*)(Foo*)>(makeFoo(), &destroyFoo);
auto first = pfoo->bars;
auto last = first + pfoo->bar_count;
doBarThings(first, last);
As the standard is currently worded, this code is UB. That's unacceptable.

Trailing arrays are not supported in C++, AFAIK, so this won't compile
in C++.


Richard Hodges

2017-01-17 19:46:10 UTC

But it should be, because C++ should be compatible with C. The code has
clearly been declared as extern "C". So it's not a C++ struct, it's a C
struct.

It should obey the C memory model, and interoperate correctly with C++, in
the same way that Objective-C++ understands C, C++ and Objective-C.


Nicol Bolas

2017-01-17 19:55:16 UTC

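Post by Richard Hodges
But it should be, because C++ should be compatible with C. The code has
clearly been declared as extern "C". So it's not a C++ struct, it's a C
struct.

C++ is not a superset of C and it *never has been*. Users should not expect
to be able to throw *any* C struct at C++ and have it work with it.

Post by Richard Hodges
It should obey the C memory model, and interoperate correctly with C++, in
the same way that Objective-C++ understands C, C++ and Objective-C.

Objective-C is designed to be a pure superset of C. C++ is not designed to
be a pure superset of C. And that simply is not going to change.

Interop is one thing, expecting to shove C at a C++ compiler is another.
The latter has never been true.

C/C++ is not a real language, and it's time people finally accepted that.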


Thiago Macieira

2017-01-17 20:29:00 UTC

Post by Nicol Bolas
Interop is one thing, expecting to shove C at a C++ compiler is another.
The latter has never been true.

void foo(int x[static 8]) {}
void bar(int n, int x[static n]) {}

Not to mention variables called "class", "new", "delete", etc.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Richard Hodges

2017-01-17 20:30:45 UTC

Post by Nicol Bolas
C++ is not a superset of C and it *never has been*. Users should not
expect to be able to throw *any* C struct at C++ and have it work with it.

And yet this is exactly what I can do, de-facto, today. And it is exactly
this undefined behaviour that the entire c++ enterprise depends on, today.

https://godbolt.org/g/rn188s

Code (proven on clang and g++):

#include <stdlib.h>

extern "C"
{

// Flexible array member: C99 syntax, accepted here by clang and g++.
struct Foo {
    int bars;
    double bar[];
};

// Allocate a Foo with room for six trailing doubles and fill them in.
Foo *makeFoo() {
    int a = 6;
    auto vp = malloc(sizeof(Foo) + a * sizeof(double));
    auto p = (Foo *) vp;
    p->bars = a;
    for (int i = 0; i < a; ++i) {
        p->bar[i] = i * 2;
    }
    return p;
}

void deleteFoo(Foo *p) {
    free(p);
}

}

#include <memory>
#include <iostream>
#include <algorithm>
#include <iterator>

// Deleter so that unique_ptr releases the Foo through the C-style API.
struct FooDeleter {
    void operator()(Foo *p) const {
        deleteFoo(p);
    }
};

int main() {
    using fooptr = std::unique_ptr<Foo, FooDeleter>;
    auto p = fooptr(makeFoo());

    // Treat the trailing array as an iterator range.
    auto first = p->bar;
    auto last = first + p->bars;
    std::copy(first, last, std::ostream_iterator<double>(std::cout, ", "));
    std::cout << std::endl;
}

Post by Nicol Bolas
Post by Richard Hodges
But it should be, because C++ should be compatible with C. The code has
clearly been declared as extern "C". So it's not a C++ struct, it's a C
struct.
C++ is not a superset of C and it *never has been*. Users should not
expect to be able to throw *any* C struct at C++ and have it work with it.
Post by Richard Hodges
It should obey the C memory model, and interoperate correctly with C++, in
the same way that Objective-C++ understands C, C++ and Objective-C.
Objective-C is designed to be a pure superset of C. C++ is not designed to
be a pure superset of C. And that simply is not going to change.
Interop is one thing, expecting to shove C at a C++ compiler is another.
The latter has never been true.
C/C++ is not a real language, and it's time people finally accepted that.

Richard Hodges

2017-01-17 20:59:12 UTC

All the standard needs is the addition of a paragraph that says:

'any type or intrinsic object declared inside extern "C" exists in the C
memory model (see ISO standard xxx). Members of objects of an extern "C"
class type shall behave as per the C language, previously cited. The C++
and C memory models must coexist in an unsurprising way'

End of problem.


Thiago Macieira

2017-01-18 05:15:00 UTC

Post by Richard Hodges
'any type or intrinsic object declared inside extern "C" exists in the C
memory model (see ISO standard xxx). Members of objects of an extern "C"
class type shall behave as per the C language, previously cited. The C++
and C memory models must coexist in an unsurprising way'

I'd rather not go there, for two reasons:
1) changing the rules for extern "C" is opening Pandora's box
2) it's not sufficient, as it's possible that C headers do:

struct Foo {
    int bars;
    double bar[];
};

#ifdef __cplusplus
extern "C" {
#endif

struct Foo *makeFoo();

#ifdef __cplusplus
}
#endif

I think we already have the rule we need: POD should behave like C. That's
what "POD" exists for anyway.
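
As a rough illustration of what "behaves like C" can be checked against today, here is a small sketch using the standard type traits. `Point`, `Counter`, `is_pod_like` and `check_point_layout` are hypothetical names invented for this example, not anything proposed in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical struct of the kind a C header would declare.
struct Point {
    int x;
    int y;
};

// Hypothetical counter-example: the user-provided constructor makes it non-trivial.
struct Counter {
    Counter() : n(0) {}
    int n;
};

// "POD" in the C++11 sense: both trivial and standard-layout.
template <class T>
constexpr bool is_pod_like() {
    return std::is_standard_layout<T>::value && std::is_trivial<T>::value;
}

static_assert(is_pod_like<Point>(), "Point should be usable from C");
static_assert(!is_pod_like<Counter>(), "Counter is not trivial");
static_assert(offsetof(Point, x) == 0, "first member sits at offset 0");

// For a standard-layout struct, its address equals that of its first member.
void check_point_layout(const Point& p) {
    assert(static_cast<const void*>(&p) == static_cast<const void*>(&p.x));
}
```

These checks only describe layout and triviality; they say nothing about whether an object of the type has been created in a given piece of memory, which is the point under dispute.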
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center


Nicol Bolas

2017-01-17 21:29:40 UTC

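Post by Richard Hodges
Post by Nicol Bolas
C++ is not a superset of C and it *never has been*. Users should not
expect to be able to throw *any* C struct at C++ and have it work with it.
And yet this is exactly what I can do, de-facto, today. And it is exactly
this undefined behaviour that the entire c++ enterprise depends on, today.

That's not "undefined behavior"; those are compiler extensions. That's
different; compilers are allowed to give non-C++ syntax meaning if they so
choose.

Post by Richard Hodges
'any type or intrinsic object declared inside extern "C" exists in the C
memory model (see ISO standard xxx). Members of objects of an extern "C"
class type shall behave as per the C language, previously cited. The C++
and C memory models must coexist in an unsurprising way'

Um, no. This now requires that every C++ compiler *also* implement
<insert version here> of C. Just because "c++ enterprise" depends on some
non-C++ features doesn't mean we should shove them into the standard.

Furthermore, declaring that the two *object* models "must coexist in an
unsurprising way" basically says nothing. It's about as useful for deciding
on behavior as the wording on pointer-to-`intptr_t` conversions.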


Richard Hodges

2017-01-17 21:53:57 UTC

Post by Nicol Bolas
Um, no. This now requires that every C++ compiler *also* implement
<insert version here> of C.

This is not what I am saying. I am saying that the objects that are imbued
with extern "C" must exist in the C memory model. They already do, and the
entirety of the c++ world depends upon this fact today. This is an
inescapable truth.

Post by Nicol Bolas
Just because "c++ enterprise" depends on some non-C++ features doesn't
mean we should shove them into the standard.

The alternative is that almost every meaningful application and library in
existence is fundamentally non-portable. This is not in the interests of
c++ developers, users of their work, or indeed manufacturers of the
compilers. So I have to say, with respect, that you are mistaken. An ISO
standard describing a system built upon UB is meaningless because all
programs become strictly non-portable. See above.

Post by Nicol Bolas
Furthermore, declaring that the two *object* models "must coexist in an
unsurprising way" basically says nothing. It's about as useful for deciding
on behavior as the wording on pointer-to-`intptr_t` conversions.

You know full well that I am paraphrasing.

I'll make you a bet. Let's put this question to a poll of c++ developers
(say with more than 4 years' experience). I'll give you even money on any
bet you care to take that my position would win by a ratio exceeding 7:3.

It's what the language does. It's what the entire developer base expects
the compiler to do. It is the de-facto truth of c++. The standard is
currently perverse in stating otherwise.

The committee should hang its head in shame.


Demi Obenour

2017-01-17 22:13:10 UTC

Agreed. Furthermore, I claim that code that blasts bits into memory and
treats it as an object will always be there. There is no other good way to
achieve guaranteed zero-copy operation.



Martin Ba

2017-01-17 22:49:39 UTC

Post by Nicol Bolas
Um, no. This now requires that every C++ compiler *also* implement
<insert version here> of C.
This is not what I am saying. I am saying that the objects that are imbued
with extern "C" must exist in the C memory model. They already do, and the
entirety of the c++ world depends upon this fact today. This is an
inescapable truth.

Post by Nicol Bolas
Just because "c++ enterprise" depends on some non-C++ features doesn't
mean we should shove them into the standard.
The alternative is that almost every meaningful application and library in
existence is fundamentally non-portable. This is not in the interests of
c++ developers, users of their work, or indeed manufacturers of the
compilers. So I have to say, with respect, that you are mistaken. An ISO
standard describing a system built upon UB is meaningless because all
programs become strictly non-portable. See above.

This!

Post by Nicol Bolas
Furthermore, declaring that the two *object* models "must coexist in an
unsurprising way" basically says nothing. It's about as useful for deciding
on behavior as the wording on pointer-to-`intptr_t` conversions.
You know full well that I am paraphrasing.
I'll make you a bet. Let's put this question to a poll of c++ developers
(say with more than 4 years' experience). I'll give you even money on any
bet you care to take that my position would win by a ratio exceeding 7:3.
It's what the language does. It's what the entire developer base expects
the compiler to do. It is the de-facto truth of c++. The standard is
currently perverse in stating otherwise.

Exactly my thoughts.

Post by Richard Hodges
The committee should hang its head in shame.

Who knows? Has anyone that actually participates in CWG commented on the
original question yet, namely whether this has been discussed (like this
thread here) and whether there have been any conclusions?

It seems to be a hard problem to solve with very little immediate practical
benefit, so I'm rather not surprised it hasn't been done yet (on the other
hand, someone found time for P0137R1).


Jens Maurer

2017-01-18 07:36:41 UTC

Post by Martin Ba
Who knows? Has anyone that actually participates in CWG commented on the
original question yet, namely whether this has been discussed (like this
thread here) and whether there have been any conclusions?

As I said before, I think I can say that CWG believes the post-P0137R1
wording in this area accurately reflects the intent of the committee.

I've re-read the discussion notes on P0137R1, and it seems the question
of "malloc", while somewhat related to P0137R1, has not been discussed
specifically. ("malloc" is also sort-of out-of-scope for P0137R1.)
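
To make the "malloc" case concrete, here is a sketch of the conservative pattern that does not rely on malloc itself creating an object: allocate, then create the object with placement new. `Pair`, `make_pair_in_malloc`, `sum` and `destroy_pair` are hypothetical names for illustration only.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical trivial type used only for this illustration.
struct Pair {
    int a;
    int b;
};

// malloc provides storage; placement new creates the Pair object in it.
Pair* make_pair_in_malloc(int a, int b) {
    void* raw = std::malloc(sizeof(Pair));
    if (raw == nullptr) {
        return nullptr;
    }
    return ::new (raw) Pair{a, b};
}

int sum(const Pair* p) {
    assert(p != nullptr);
    return p->a + p->b;
}

// End the object's lifetime, then hand the storage back to free.
void destroy_pair(Pair* p) {
    if (p != nullptr) {
        p->~Pair();
        std::free(p);
    }
}
```

Whether the placement new should be required at all for a type like this is the open question a paper would have to address.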

I believe any change in this area needs a paper (possibly even reviewed
by EWG, not just CWG), as opposed to some ranting on a mailing list,
so if you feel strongly that a change is required for "malloc", please
propose one (preferably with wording changes) in a paper, for review
at one of the next WG21 meetings.

Jens


Richard Hodges

2017-01-18 07:57:56 UTC

The committee has already rejected async io and continuation futures -
fundamental building blocks that every other language takes for granted.
It's had years to come to its senses. All it had to do was adopt some
longstanding boost constructs into the language without watering them down
to the point of uselessness. It couldn't even do that. It's clearly not fit
for purpose.

Who's in charge? What's his number?

If they're still fumbling over whether a cast is really a cast, or
mentally pleasuring themselves over whether they're really allowed to
implement std::vector or not, then I'm afraid the metaphor would have to
be that the band is still playing while the Titanic is being steered into
an iceberg. Honestly, from the point of view of a long-time user and
evangelist of this language, it just makes you sound like a bunch of nerds
wondering whether you're allowed to look inside the school computer, or
whether you're going to get a scolding from teacher.

A new committee needs to be recruited. People who actually want to achieve
something, to move the language forward to the point where it becomes the
de-facto choice for any project.

It is my view that c++ is the best language for performant, cross platform,
licence-free development. This is despite the efforts of the committee, not
because of them.

I don't care when the "problem" of "is my memory really my memory" came up.
All I care about is that after 21 years, this trivial, petty, non-issue
gets closed (by deleting the ridiculous notion that you have to call new on
an int, are you all mad??) and the language moves on.
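
For readers who have not seen it written out, this is roughly what "calling new on an int" in raw storage looks like. It is a sketch only; `IntStorage` and `make_int` are hypothetical names, and the alignment assert is there purely to make the precondition visible.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Raw storage sized and aligned for exactly one int.
struct IntStorage {
    alignas(int) unsigned char bytes[sizeof(int)];
};

// Create an int object inside the storage and return a pointer to it.
int* make_int(IntStorage& storage, int value) {
    assert(reinterpret_cast<std::uintptr_t>(&storage.bytes[0]) % alignof(int) == 0);
    return ::new (static_cast<void*>(&storage.bytes[0])) int(value);
}
```

For a type like int the placement new typically compiles to nothing more than the store of the value, so the requirement is about the object model rather than about generated code.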

Damn right I'll step up. Who do I need to speak to?


b***@gmail.com

2017-01-18 13:05:31 UTC

Post by Richard Hodges
Damn right I'll step up. Who do I need to speak to?

Probably a good therapist. Insulting and trivializing people isn't exactly
an efficient way to reach consensus.

Once you do that, you can follow the steps
here: https://isocpp.org/std/submit-a-proposal


Richard Hodges

2017-01-18 13:19:41 UTC

This isn't going to help. There is little point submitting proposals to a
committee that is resistant to progress. We have watched with dismay as
networking has foundered, concepts still aren't in, we still have to call
new on an int-aligned memory location to make it an int (try explaining
that to someone other than a committee member, they'll think you're nuts).

If any of you on the committee feel insulted, you need to stop being so
thin skinned. This can't be the first time someone from the user community
has stood up and called you out on the current failure.

Sorry chaps, my livelihood, and that of hundreds of thousands of others
depends upon you doing your job properly. When you do wrong, you need to
hear it.

I am very happy to take a driving seat on the committee if that is what it
will take to rescue this language from the current half-decade of
navel-gazing.

So, who's in charge? What's his phone number?

Herb Sutter? Bjarne Stroustrup? Drop me a line. If it's not already obvious
how to fix that standard I'll explain it in 20 minutes. Then we can move on
and add some useful features, like, what's that thing that every computer
has? Oh yes. Networking.

Time to step up to the plate gentlemen. The language needs help.

--
---
You received this message because you are subscribed to the Google Groups
"ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at
https://groups.google.com/a/isocpp.org/group/std-discussion/.

--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.

Tom Honermann

2017-01-18 14:40:34 UTC

Post by Richard Hodges
So, who's in charge?

You are. The committee is composed of volunteers.

If you aren't happy with the direction the committee is taking, I
suggest you attend a meeting, get to know the members, understand and
appreciate their concerns and motivations, and contribute to the hard
work required to move forward a language that millions of people rely on.

https://isocpp.org/std/meetings-and-participation/

Tom.

Nicol Bolas

2017-01-18 16:40:13 UTC

Post by Richard Hodges
The committee has already rejected async io and continuation futures

... what? Since when were these *rejected*?

I don't know of any async io proposals. Unless you're talking about the
Networking TS, which is... a Technical Specification. That's *the exact
opposite* of being "rejected".

As for continuation futures, those too were not "rejected". I believe they
were part of library fundamentals v2, which was not adopted into C++17. But
that's far from being "rejected".

So I really have no idea what you're talking about.

Richard Hodges

2017-01-18 16:46:48 UTC

Post by Nicol Bolas
those too were not "rejected"

"not adopted" means exactly the same thing as "rejected" when you are
waiting for a feature to come out so that you can standardise code across
platforms.

Every 3-year delay in the committee adopting a perfectly good feature is a
waste of hundreds of thousands of man-hours (probably much more) across the
c++ developer community.

Pedantically splitting hairs with concerned observers is the kind of
behaviour that causes delays in the progress of c++. Let's not do that
either.

Nicol Bolas

2017-01-18 17:10:39 UTC

Post by Nicol Bolas
those too were not "rejected"

"not adopted" means exactly the same thing as "rejected" when you are
waiting for a feature to come out so that you can standardise code across
platforms.

No, it really doesn't.

If something is rejected, that means it's flat-out not gonna happen. So you
should stop waiting for it and try to work around the problem. If it was
standardized in a TS, then it's going to happen, but not yet. Thus, waiting
is a legitimate action.

Ville Voutilainen

2017-01-17 20:24:02 UTC

Post by Richard Hodges
But it should be, because c++ should be compatible with c. The code has
clearly been declared as extern "C". So it's not a c++ struct, it's a c
struct.

extern "C" doesn't mean that the code inside it is compiled as C.
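
A minimal sketch of that point, using hypothetical names (`Counter`, `count_chars`) that do not appear in the thread: a linkage specification only affects how names are linked, so everything inside the braces is still compiled as C++.

```cpp
#include <cassert>
#include <string>

extern "C" {

// Still a C++ class type: a default member initializer and a member
// function are accepted here, although neither exists in C.
struct Counter {
    int value = 0;
    void bump() { ++value; }
};

// The function name gets C linkage, but the body is ordinary C++.
int count_chars(const char* text) {
    std::string copy(text);
    return static_cast<int>(copy.size());
}

}  // extern "C"
```

A C translation unit could call `count_chars`, but it could not include the definition of `Counter` as written, because that definition is only valid C++.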

Thiago Macieira

2017-01-17 18:26:18 UTC

On Tuesday, 17 January 2017, at 17:32:52 PST, Richard Hodges wrote:

Post by Richard Hodges
Note that I am asserting *should absolutely* - a very strong statement.
This is because we absolutely cannot move away from c. There are no c++
operating systems. Therefore all useful libraries are written with C
interfaces.

Side-note: Microsoft's UCRT library implements the C library in C++. Yes, it
has an extern "C" interface, but it's actually a C++ library.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Nicol Bolas

2017-01-17 18:28:43 UTC

Post by Richard Hodges
I cannot see any reasonable argument that pointer arithmetic should not be
allowed to work on consecutive objects.

And nobody has made such an argument. Indeed, I'm pretty sure that I stated
quite the opposite. Though for very different reasons and different
restrictions.

Post by Richard Hodges
int* p = get_ints_from_c(); *(++p); *should absolutely* be defined
behaviour in c++ provided there is actually some memory at
std::addressof(*p) + sizeof(p); - there is no conceivable reason why it
should not.
Note that I am asserting *should absolutely* - a very strong statement.
This is because we absolutely cannot move away from c. There are no c++
operating systems. Therefore all useful libraries are written with C
interfaces. Thousands of c++ wrapper libraries exist to turn those C
interfaces back into c++. We don't do that because we want to. We do that
because C++ is not suitable for creating portable object libraries, having no
modules or common ABI.
By all means let's talk about moving forward - after we have modules,
defined ABIs, an agreed-upon means to transmitting exceptions and so on.
Until then, the entire foundation of our C++ universe is C. To try to
pretend otherwise is a fallacy.
OK, it's "difficult" to marry the c++ abstract machine model with the C
memory model. So what? That doesn't mean that it should not be done.

OK, so explain what we will gain by doing all of this work. How will it
make my currently functional code faster and/or better? How will it make my
programs more correct? How will it improve the C++ object model in ways
that are useful for actual C++ programs?

The status quo is adequately functional. And if C is as entwined with C++
as you believe, then no compiler vendor is going to break the world with
"optimizations" that don't actually make things more optimal.

Post by Richard Hodges
Clearly the definition of the C++ abstract machine needs to be revised, or
made more granular. Difficulty does not come into it.
Should bitwise copies of non-trivial objects be allowed? Of course not.

But they are. Right now.

Bitwise copies of non-trivially *copyable* types are forbidden. Trivial
types <http://en.cppreference.com/w/cpp/concept/TrivialType> are a subset
of trivially copyable
<http://en.cppreference.com/w/cpp/concept/TriviallyCopyable> ones.

Post by Richard Hodges
We can express the reason why not as a high level "because of memory model
concerns" or we can be truthful: "because the pointers will be wrong and
your double delete will crash the program".

... that's not why we forbid non-trivially copyable types from being
bit-copied.

We do it because if you have written a copy/move constructor/assignment
operator or a destructor, then you clearly have needs for your object that
cannot be satisfied by mere bit-fiddling. A live destructor represents that
dropping an object on the floor does not represent a valid way to get rid
of it. Live copy/move means that there is some internal resource your
object is managing, and therefore bit-copying is incapable of doing so.

"Because of memory model concerns" is *the truth*. To say it's because of
pointer stuff or whatever is to lie to the user, to declare that "trivially
copyable" is some arbitrary construct that has no intrinsic meaning.

A trivial type is a type for which any value is equally valid. A trivially
copyable type is a type for which bitwise-copying makes sense. These are
legitimate constructs that have a representation in your code structure.
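
As an illustration only, the two categories can be checked with the standard type traits. The struct names below are made up for this sketch; `bitwise_copy` is a hypothetical helper showing the kind of copy that trivially copyable types permit.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Trivial, and therefore also trivially copyable.
struct Plain {
    int x;
    double y;
};

// Trivially copyable but not trivial: the default constructor is
// user-provided, while copying and destruction remain trivial.
struct Defaulted {
    int x;
    Defaulted() : x(7) {}
};

static_assert(std::is_trivial<Plain>::value, "");
static_assert(std::is_trivially_copyable<Plain>::value, "");
static_assert(!std::is_trivial<Defaulted>::value, "");
static_assert(std::is_trivially_copyable<Defaulted>::value, "");

// Bitwise copy between two existing objects of a trivially copyable type.
inline Plain bitwise_copy(const Plain& from) {
    Plain to;
    std::memcpy(&to, &from, sizeof(Plain));
    return to;
}
```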

You could write a non-trivially copyable type, such that your
implementation of one of the non-trivial functions doesn't obstruct bitwise
copying. But C++ *cannot* know that; that is an implementation detail of
your function which cannot be tested for. So your type is forbidden from
doing so, even though by your "low level high level" reasoning, it should
be allowed.
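
A small sketch of that situation, with hypothetical type names: a destructor that does nothing still counts as user-provided, so the type stops being trivially copyable even though a bitwise copy would be harmless in practice.

```cpp
#include <cassert>
#include <type_traits>

struct Harmless {
    int id;
    ~Harmless() {}  // user-provided, even though the body is empty
};

struct Implicit {
    int id;  // implicitly declared destructor, which is trivial
};

// Hypothetical helper: reports whether the language permits bit-copying T.
template <class T>
constexpr bool can_bitcopy() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(!std::is_trivially_destructible<Harmless>::value, "");
static_assert(!can_bitcopy<Harmless>(), "");
static_assert(can_bitcopy<Implicit>(), "");
```

The compiler decides from the declaration alone; it never inspects the destructor body, which is the "cannot be tested for" part of the argument.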

That's a good thing. It makes our rules simple and easily testable and
verifiable. It allows us to write `std::copy` and `std::vector`
implementations that perform optimally for types where we know we can
perform bitcopies. And for those of us who want to take advantage of such
optimizations, we know exactly what we have to do to create such types.
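
One way such an optimization could be written is tag dispatch on the trait. This is a sketch under that assumption, not how any particular standard library implements `std::copy`; the namespace `sketch` and the function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

namespace sketch {

// Fast path: one memcpy for the whole range.
template <class T>
void copy_elements_impl(const T* src, std::size_t n, T* dst, std::true_type) {
    if (n != 0) std::memcpy(dst, src, n * sizeof(T));
}

// General path: element-wise copy assignment.
template <class T>
void copy_elements_impl(const T* src, std::size_t n, T* dst, std::false_type) {
    for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];
}

// Picks the path at compile time from the type alone.
// The source and destination ranges must not overlap.
template <class T>
void copy_elements(const T* src, std::size_t n, T* dst) {
    copy_elements_impl(src, n, dst, std::is_trivially_copyable<T>{});
}

}  // namespace sketch
```

Calling `sketch::copy_elements` on an array of `int` takes the `memcpy` path, while an array of `std::string` falls back to element-wise assignment, with no annotation needed from the caller.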

Your kind of "wild west" thinking doesn't allow for such simple, easily
testable constructs. For your kind of coding, the user would have to
explicitly inform `copy` or `vector` that it should perform bitcopies.

Martin Ba

2017-01-17 22:45:35 UTC

Post by Richard Hodges
I cannot see any reasonable argument that pointer arithmetic should not
be allowed to work on consecutive objects.

And nobody has made such an argument. Indeed, I'm pretty sure that I
stated quite the opposite. Though for very different reasons and different
restrictions.

What we will gain is a Standard that is not contradicting reality, which is
*at least* a marketing asset for the language. (compare: isocpp.org)

What we will gain is people writing reasonable real world "low level" code
*not being told* that their code is UB and that they should resort to
memcpy and no-op placement-new contortions - this is at least an asset wrt.
the learning curve of the language.

What we will gain is not having to spend time on rather fruitless
discussion like this one here.

Post by Nicol Bolas
The status quo is adequately functional. And if C is as entwined with C++
as you believe, then no compiler vendor is going to break the world with
"optimizations" that don't actually make things more optimal.

The status quo in reality is functional. The Standard contradicts reality
in this regard. That it works everywhere in practice, and can be expected
to do so, is only an argument for the priority of fixing this, not an
argument for not fixing the Standard.

I have to say I do not quite follow your argumentation wrt. this: on one
hand you seem to care very much about the Standard supplying a useful and
consistent object model, but on the other hand, you seem to say that the
places where this shiningly consistent model is violated by a huge fraction
of programs in existence don't matter because they will continue to work
anyway.

Nicol Bolas

2017-01-18 16:30:43 UTC

Post by Richard Hodges
I cannot see any reasonable argument that pointer arithmetic should not
be allowed to work on consecutive objects.

And nobody has made such an argument. Indeed, I'm pretty sure that I
stated quite the opposite. Though for very different reasons and different
restrictions.

What we will gain is a Standard that is not contradicting reality, which
is *at least* a marketing asset for the language. (compare: isocpp.org)

Exactly how do you market something that by definition doesn't actually
change anything? Behold, the new C++, with 100% more of the exact same
stuff you had before.

Post by Martin Ba
What we will gain is people writing reasonable real world "low level" code
*not being told* that their code is UB and that they should resort to
memcpy and no-op placement-new contortions - this is at least an asset wrt.
the learning curve of the language.

Nonsense. It is only a learning curve problem for *C programmers*. For
people from other languages, or none at all, they learn what they are told.

Native C++ programmers, who have not been exposed to C-isms, are highly
unlikely to resort to casting memory and so forth.

Post by Martin Ba
What we will gain is not having to spend time on rather fruitless
discussion like this one here.

The status quo in reality is functional. The Standard contradicts reality
in this regard. That it works everywhere in practice, and can be expected
to do so, is only an argument for the priority of fixing this, not an
argument for not fixing the Standard.
I have to say I do not quite follow your argumentation wrt. this: on one
hand you seem to care very much about the Standard supplying a useful and
consistent object model, but on the other hand, you seem to say that the
places where this shiningly consistent model is violated by a huge fraction
of programs in existence don't matter because they will continue to work
anyway.

Essentially yes, but there's more to it than that.

The problem basically boils down to this: C++ makes C-isms undefined
behavior, but a lot of code relies on C-isms, so compilers aren't free to
discard them or do anything about them. The solutions being tossed about
here are that we should make them well-defined behavior.

I have a different solution. Instead of promoting garbage C-isms like
pointer casting and so forth, we make C++ equivalents. Placement `new` is
one such C++-ism which allows the creation of C++ objects in arbitrary
memory. But we can add many more.
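
For readers who have not used it, a minimal sketch of placement `new` on raw storage, which is the case the thread started from. `make_int_in` is a hypothetical helper name used only for this example.

```cpp
#include <cassert>
#include <new>

// Creates an int inside caller-provided raw storage and returns a pointer
// to the new object. The caller must guarantee that the storage is large
// enough and suitably aligned for an int.
inline int* make_int_in(void* storage, int value) {
    return ::new (storage) int(value);
}
```

Typical usage would be `alignas(int) unsigned char buf[sizeof(int)]; int* p = make_int_in(buf, 42);`. For a type like `int`, mainstream compilers are expected to emit nothing beyond the store of the value.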

If people need a way to take memory that has been filled in from external
code and use that as a C++ object which is compatible with the layout of
that memory, let's provide them with a function that does that. If people
need a way to initialize an object directly from compatible data externally
provided, let's provide them a way to do that. Let's take all of the useful
C-isms and provide C++ ways to do them, rather than promoting pointer
casting and whatnot as good code.
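
The post does not name a concrete function. As a rough sketch of what a helper of the second kind might look like using only what the language already guarantees, here is a `memcpy`-based loader. `load_from_bytes` and `Header` are hypothetical names, and the sketch assumes `T` is default-constructible.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper, not a standard facility: builds a T from
// externally provided bytes that already hold a valid representation.
template <class T>
T load_from_bytes(const void* bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    T object;  // a real object with its own lifetime
    std::memcpy(&object, bytes, sizeof(T));
    return object;
}

// Example layout that external code might fill in.
struct Header {
    unsigned short kind;
    unsigned short length;
};
```

Because the result is a copy rather than a view, this does not cover the first case in the post (using the external memory in place), which is the harder one.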

In the end, if we give low-level programmers ways to work *within* the C++
memory model that don't make their code slower, then they ought to stop
using C-isms. And those few who continue to rely on C-isms in C++ will just
be no different than any other code that relies on UB.

Richard Hodges

2017-01-18 16:41:11 UTC

Post by Nicol Bolas
Essentially yes, but there's more to it than that.

Only if you enjoy complicating your life.

Post by Nicol Bolas
The problem basically boils down to this: C++ makes C-isms undefined
behavior, but a lot of code relies on C-isms, so compilers aren't free to
discard them or do anything about them. The solutions being tossed about
here are that we should make them well-defined behavior.

The problem is that intrinsic types are not objects, and neither are PODs.
To treat them the same is counter-factual. Aligned memory is inherently a
union of all PODs that will fit. Make it so in the standard. End the
argument forever.

This makes the C++ behaviour the same as C behaviour when intrinsics and
PODs are mapped onto memory. It's logical, everyone does it anyway, and
it's never going away in gcc or clang. End of problem. Let's get on with
something new.

Post by Nicol Bolas
I have a different solution. Instead of promoting garbage C-isms like
pointer casting and so forth, we make C++ equivalents. Placement `new` is
one such C++-ism which allows the creation of C++ objects in arbitrary
memory. But we can add many more.

NO - because that just adds more useless work for programmers. It's the
reverse of what *auto* does - which is make life easier and better. Useless
work like having to formally introduce storage (which is what you're
suggesting) is what COBOL and Pascal did. They're dead now. Let's not do
that.

Post by Nicol Bolas
If people need a way to take memory that has been filled in from external
code and use that as a C++ object which is compatible with the layout of
that memory, let's provide them with a function that does that. If people
need a way to initialize an object directly from compatible data externally
provided, let's provide them a way to do that. Let's take all of the useful
C-isms and provide C++ ways to do them, rather than promoting pointer
casting and whatnot as good code.

No need for any of that. It already happens in gcc. gcc *is* the standard.
The ISO standard needs to catch up.

Robert Haberlach

2017-01-18 17:10:35 UTC

Post by Richard Hodges
The problem is that intrinsic types are not objects, and neither are PODs.
To treat them the same is counter-factual. aligned memory is inherently a
union of all PODs that will fit.

Are you stoned?

Richard Hodges

2017-01-18 19:43:49 UTC

Post by Robert Haberlach
Are you stoned?

No, but if you ever visit Spain I can show you where you may become so
(quite legally) if you wish.

What is your problem with memory being... memory?

On a system where ints are 32 bits, 32-bit words are addressable without
bitwise arithmetic, and the compiler deems that 128 bits is a reasonable
alignment strategy...:

struct A {
    int a;
    int b[2];
};

struct B {
    int a[2];
    int b;
};

... both A and B occupy 128 bits. The value of the last 32 bits is
irrelevant.

There is no reason whatsoever (other than handwaving from the 'c++ is not a
low level language' crowd) that they should not be a union of each other.

It makes no difference to optimisers, no difference to threading, no
difference to anything other than the sensibilities of snowflake theorists.
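
For comparison, a sketch of how one of these structs can be viewed as the other today without relying on the behaviour under dispute: copy the bytes into a real `B`. The helper name `reinterpret_as_B` is hypothetical, and the second `static_assert` encodes an assumption of this sketch (no padding between the `int` members), not the 128-bit alignment described in the post.

```cpp
#include <cassert>
#include <cstring>

struct A {
    int a;
    int b[2];
};

struct B {
    int a[2];
    int b;
};

static_assert(sizeof(A) == sizeof(B), "sketch assumes equal sizes");
static_assert(sizeof(A) == 3 * sizeof(int), "sketch assumes no padding");

// Copies the object representation of an A into a freshly created B.
inline B reinterpret_as_B(const A& in) {
    B out;
    std::memcpy(&out, &in, sizeof(B));
    return out;
}
```

Under those assumptions the three `int` values simply line up in order: `in.a` lands in `out.a[0]`, `in.b[0]` in `out.a[1]`, and `in.b[1]` in `out.b`.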

Be a man. Embrace your memory.

Post by Nicol Bolas
Essentially yes, but there's more to it than that.

Only if you enjoy complicating your life.

Post by Nicol Bolas
The problem basically boils down to this: C++ makes C-isms undefined
behavior, but a lot of code relies on C-isms, so compilers aren't free to
discard them or do anything about them. The solutions being tossed about
here are that we should make them well-defined behavior.
The problem is that intrinsic types are not objects, and neither are PODs.
To treat them the same is counter-factual. Aligned memory is inherently a
union of all PODs that will fit.
Are you stoned?
Make it so in the standard. End the argument forever.
This makes the C++ behaviour the same as C behaviour when intrinsics and
PODs are mapped onto memory. It's logical, everyone does it anyway, and
it's never going away in gcc or clang. End of problem. Let's get on with
something new.

Post by Nicol Bolas
I have a different solution. Instead of promoting garbage C-isms like
pointer casting and so forth, we make C++ equivalents. Placement `new` is
one such C++-ism which allows the creation of C++ objects in arbitrary
memory. But we can add many more.
NO - because that just adds more useless work for programmers. It's the
reverse of what *auto* does - which is make life easier and better.
Useless work like having to formally introduce storage (which is what
you're suggesting) is what COBOL and Pascal did. They're dead now. Let's
not do that.

Post by Nicol Bolas
If people need a way to take memory that has been filled in from

Post by Richard Hodges
I cannot see any reasonable argument that pointer arithmetic should
not be allowed to work on consecutive objects.

And nobody has made such an argument. Indeed, I'm pretty sure that I
stated quite the opposite. Though for very different reasons and different
restrictions.

Post by Richard Hodges
int* p = get_ints_from_c(); *(++p); *should absolutely* be defined
behaviour in c++ provided there is actually some memory at
std::addressof(*p) + sizeof(p); - there is no conceivable reason why it
should not.
Note that I am asserting *should absolutely* - a very strong
statement. This is because we absolutely cannot move away from c. There are
no c++ operating systems. Therefore all useful libraries are written with C
interfaces. Thousands of c++ wrapper libraries exist to turn those C
interfaces back into c++. We don't do that because we want to. We do that
because C++ is not suitable for creating portable object libraries, having no
modules or common ABI.
By all means let's talk about moving forward - after we have modules,
defined ABIs, an agreed-upon means of transmitting exceptions and so on.
Until then, the entire foundation of our C++ universe is C. To try to
pretend otherwise is a fallacy.
OK, it's "difficult" to marry the c++ abstract machine model with the
C memory model. So what? That doesn't mean that it should not be done.

What we will gain is a Standard that is not contradicting reality, which
is *at least* a marketing asset for the language. (compare: isocpp.org)

Exactly how do you market something that by definition doesn't actually
change anything? Behold, the new C++, with 100% more of the exact same
stuff you had before.

Post by Martin Ba
What we will gain is people writing reasonable real world "low level"
code *not being told* that their code is UB and that they should resort to
memcpy and no-op placement-new contortions - this is at least an asset wrt.
the learning curve of the language.

Nonsense. It is only a learning curve problem for *C programmers*. For
people from other languages, or none at all, they learn what they are told.
Native C++ programmers, who have not been exposed to C-isms, are highly
unlikely to resort to casting memory and so forth.

Post by Martin Ba
What we will gain is not having to spend time on rather fruitless
discussion like this one here.

Post by Nicol Bolas
The status quo is adequately functional. And if C is as entwined with
C++ as you believe, then no compiler vendor is going to break the world
with "optimizations" that don't actually make things more optimal.

The status quo in reality is functional. The Standard contradicts
reality in this regard. That it works everywhere in practice, and can be
expected to do so, is only an argument for the priority of fixing this, not
an argument for not fixing the Standard.
I have to say I do not quite follow your argumentation wrt. this: on one
hand you seem to care very much about the Standard supplying a useful and
consistent object model, but on the other hand, you seem to say that the
places where this shiningly consistent model is violated by a huge fraction
of programs in existence don't matter because they will continue to work
anyway.

Nicol Bolas

2017-01-18 19:54:26 UTC

Post by Robert Haberlach
Are you stoned?

No, but if you ever visit Spain I can show you where you may become so
(quite legally) if you wish.
What is your problem with memory being... memory?
On a system where ints are 32 bits, 32-bit words are addressable without
bitwise arithmetic, and the compiler deems that 128 bits is a reasonable
alignment strategy...:
struct A {
int a;
int b[2];
};
struct B {
int a[2];
int b;
};
... both A and B occupy 128 bits. The value of the last 32 bits is
irrelevant.
There is no reason whatsoever (other than handwaving from the 'c++ is not
a low level language crowd') that they should not be a union of each other.

Richard Hodges

2017-01-18 20:10:06 UTC

Post by Nicol Bolas
I don't even know what you mean by "a union of each other",

But I think you do.

Post by Nicol Bolas
but that sounds very much like "let's pretend strict aliasing doesn't
exist". So not gonna happen.

Strict aliasing is another matter entirely. Clearly, writing to the same
memory through dissimilar pointers in the same loop gives optimisers more
problems when tracking aliasing. By all means let's disallow that. By all
means let's assume that a block of memory has only one "shape" at a time,
for a given logical operation on it.

And when a user chooses to say something different, by casting a pointer or
reference to that memory, let's flush any pending writes at that point.
Logically there is no difference between casting POD pointers pointing to
the same memory, and changing the element addressed in a union.
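
For comparison with the pointer cast discussed above, here is a short
sketch of the two routes that are well-defined today: assigning to a union
member (which changes the active member) and copying the object
representation with std::memcpy. The names IntOrFloat, via_union_write and
bits_of are illustrative, and the sketch assumes float and std::uint32_t
have the same size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

union IntOrFloat {
    std::uint32_t u;
    float f;
};

// Assigning to a member makes that member the active one; reading it back
// afterwards is fine.
inline float via_union_write() {
    IntOrFloat x;
    x.u = 1u;     // u is active
    x.f = 2.5f;   // f is now active
    return x.f;
}

// Copy the bytes instead of casting the pointer: no object is accessed
// through an lvalue of an unrelated type.
inline std::uint32_t bits_of(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Compilers commonly turn a fixed-size memcpy like this into a plain register
move, so the well-defined form need not cost anything at run time.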

Even if there were, imbuing a POD with extern "C" should *still* make it
behave like C memory, because that's obviously what it is.

Post by Robert Haberlach
Are you stoned?

No, but if you ever visit Spain I can show you where you may become so
(quite legally) if you wish.
What is your problem with memory being... memory?
On a system where ints are 32 bits, 32-bit words are addressable without
bitwise arithmetic, and the compiler deems that 128 bits is a reasonable
alignment strategy...:
struct A {
int a;
int b[2];
};
struct B {
int a[2];
int b;
};
... both A and B occupy 128 bits. The value of the last 32 bits is
irrelevant.
There is no reason whatsoever (other than handwaving from the 'c++ is not
a low level language crowd') that they should not be a union of each other.

Post by Nicol Bolas
C++ is a low level language. What C++ *isn't* is a language that pretends
that compilers don't get to make choices about how things work.
The compiler gets to decide how to lay out both of those structures. And I
see no reason why the compiler should be *required* to decide to lay them
out identically.
I don't even know what you mean by "a union of each other", but that
sounds very much like "let's pretend strict aliasing doesn't exist". So not
gonna happen.

Hyman Rosen

2017-01-18 22:00:08 UTC

I would prefer if the Standard said something like this:

* For any trivially copyable type T, a suitably-aligned region of
storage with size sizeof(T) is said to contain an object of type T if the
region contains the value representation of a value of T. If a region of
storage contains an object of type T, a valid pointer to an object of type
cv T* may be obtained by applying static_cast<cv T*> to a pointer to the
address of the region. The cv qualifiers on the pointer and cast must be
at least as strict as those on the region of storage. If the storage of
an object of union type contains an object of the type of a member, that
member is active.*

Hyman Rosen

2017-01-18 22:06:27 UTC

Given my wording and struct A { int a[2]; int b; }; struct B { int a; int
b[2]; };, it would be the case that any region of storage containing an A
would also contain a B. It's in that sense that A and B are "unions of
each other".
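
As an illustration of that reading, here is a sketch using the two structs
exactly as written in this message. It assumes neither struct has padding
(true on common ABIs, and checked by the static_asserts); view_as_b is a
hypothetical helper. It goes through std::memcpy, which is defined for
trivially copyable types today, rather than the static_cast that the
suggested wording would permit.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct A { int a[2]; int b; };
struct B { int a; int b[2]; };

static_assert(std::is_trivially_copyable<A>::value &&
              std::is_trivially_copyable<B>::value,
              "memcpy between objects requires trivially copyable types");
static_assert(sizeof(A) == 3 * sizeof(int) && sizeof(B) == 3 * sizeof(int),
              "sketch assumes no padding in either struct");

// Produce a B whose bytes are the bytes of the given A.
inline B view_as_b(const A& source) {
    B result;
    std::memcpy(&result, &source, sizeof result);
    return result;
}
```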

Hyman Rosen

2017-01-18 22:09:13 UTC

I mis-worded slightly:

* For any trivially copyable type T, a suitably-aligned region of
storage with size sizeof(T) is said to contain an object of type T if the
region contains the value representation of a value of T. If a region of
storage contains an object of type T, a valid pointer to an object of type
cv T* may be obtained by applying static_cast<cv T*> to the address of the
region. The cv qualifiers on the pointer and cast must be at least as
strict as those on the region of storage. If the storage of an object of
union type contains an object of the type of a member, that member is
active.*

Hyman Rosen

2017-01-18 22:33:03 UTC

And let's make it
*...with size at least sizeof(T)...*
so that we can speak of the region of storage of a union containing a T.

Hyman Rosen

2017-01-18 22:41:20 UTC

Grr... I'll get the wording right sooner or later...

*For any trivially copyable type T, a suitably-aligned region of storage
with size at least sizeof(T) is said to contain an object of type T if the
initial sizeof(T) part of the region contains the value representation of a
value of T. If a region of storage contains an object of type T, a valid
pointer to an object of type cv T* may be obtained by applying
static_cast<cv T*> to the address of the region. The cv qualifiers on the
pointer and cast must be at least as strict as those on the region of
storage. If the storage of an object of union type contains an object of
the type of a member, that member is active.*

Nicol Bolas

2017-01-18 17:19:51 UTC

Post by Nicol Bolas
Essentially yes, but there's more to it than that.

Only if you enjoy complicating your life.

Post by Nicol Bolas
The problem basically boils down to this: C++ makes C-isms undefined

You keep saying that as though it were some objective fact rather than a
choice.

To a compiler, there's no such thing as an "object"; memory is just *memory*.
It contains values. And so forth.

The concept of "object" only exists at the level of the standard. And thus,
an object can be whatever we choose for it to be.

If trivial types are not objects, then what are they? And how would that
work with code that acts on objects?

Post by Richard Hodges
This makes the C++ behaviour the same as C behaviour when intrinsics and
PODs are mapped onto memory. It's logical, everyone does it anyway, and
it's never going away in gcc or clang. End of problem. Let's get on with
something new.

Post by Nicol Bolas
I have a different solution. Instead of promoting garbage C-isms like
pointer casting and so forth, we make C++ equivalents. Placement `new` is
one such C++-ism which allows the creation of C++ objects in arbitrary
memory. But we can add many more.
NO - because that just adds more useless work for programmers. It's the
reverse of what *auto* does - which is make life easier and better.
Useless work like having to formally introduce storage (which is what
you're suggesting)

No, I'm not. What I'm saying is that *casting* should not be something that
is encouraged. If you want to perform a certain operation, you should
perform *that operation*, not fiddle around with what type a pointer points
to or other such nonsense.

Consider the following:

auto ptr = malloc(4);
auto i_ptr = static_cast<int*>(ptr);
*i_ptr = 5;

To a programmer who has never seen C-isms before, this looks like nonsense.
You get some memory, then pretend that it stores an `int`? How? Why? By
contrast:

auto ptr = malloc(4);
auto i_ptr = new(ptr) int;
*i_ptr = 5;

This is sane code. It clearly allocates memory and creates an `int`. It
then accesses it.
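
A self-contained version of the placement-new form, as a sketch: it sizes
the allocation with sizeof(int) instead of the literal 4, checks the malloc
result, and frees the storage. make_use_release is a name chosen for the
example.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Allocate raw storage, create an int in it with placement new, use it,
// then release the storage. Returns -1 if the allocation fails.
inline int make_use_release() {
    void* raw = std::malloc(sizeof(int));
    if (raw == nullptr) {
        return -1;
    }
    int* i_ptr = new (raw) int;  // an int object now exists in the storage
    *i_ptr = 5;
    const int result = *i_ptr;
    // int is trivially destructible, so no destructor call is needed.
    std::free(raw);
    return result;
}
```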

Similarly:

T t;
memcpy(&t, some_ptr, sizeof(T));

Is oddball. By contrast:

auto t = std::trivial_copy_construct<T>(some_ptr);

Is far more reasonable. You're clearly constructing a `T` from memory.
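
Note that `std::trivial_copy_construct` is a suggested facility, not something the standard library provides. A rough sketch of what such a helper might look like follows, placed in a hypothetical `sketch` namespace and built on `std::memcpy`; the `Header` type and `read_header` function are likewise invented for illustration.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for the suggested `std::trivial_copy_construct<T>`.
// Defines a T (so a real object exists) and fills it with sizeof(T) bytes
// read from `src`. The caller must guarantee that `src` points to at least
// sizeof(T) readable bytes holding a valid representation of T.
template <typename T>
T trivial_copy_construct(const void* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    static_assert(std::is_default_constructible<T>::value,
                  "T must be default constructible");
    T result;
    std::memcpy(&result, src, sizeof(T));
    return result;
}

}  // namespace sketch

// Illustrative trivially copyable type.
struct Header {
    int id;
    int length;
};

// Usage: build a Header from a byte buffer without casting the buffer pointer.
inline Header read_header(const unsigned char* bytes) {
    return sketch::trivial_copy_construct<Header>(bytes);
}
```

The call site names the type being constructed, which is the point being argued here; underneath, this sketch still does exactly what the `memcpy` version does.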

Post by Richard Hodges
is what COBOL and Pascal did. They're dead now. Let's not do that.

Post by Nicol Bolas
If people need a way to take memory that has been filled in from

No, it isn't.

--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+***@isocpp.org.
To post to this group, send email to std-***@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.

Richard Hodges

2017-01-18 18:41:21 UTC

Post by Nicol Bolas
You keep saying that as though it were some objective fact rather than a
choice.

The fact that I can cast properly aligned storage to a POD and use it as a
POD:

a) is a de-facto reality and always will be
b) is necessary to allow c++ to interact with every computer system in the
world.
c) should therefore obviously be mandated as true in the standard.

Post by Nicol Bolas
The concept of "object" only exists at the level of the standard. And
thus, an object can be whatever we choose for it to be.

Good. We agree on that. Let's choose for an 'object' to be something
more complex than a BASIC POD. Let's define a BASIC POD as being a POD with
only defaulted special functions. Let's also choose that a pointer to BASIC
POD is a template through which we manipulate memory (subject to the
as-if rule). Let's also choose that any sufficiently aligned and sized
memory block can be viewed in a defined way through a pointer to a BASIC
POD.

Now, if we choose to overlay a BASIC POD half-way into some other object,
then OBVIOUSLY, access to that other object is undefined. But the BASIC POD
is not.

Why?
a) Because this is reality and,
b) It's necessary and,
c) it solves your pet problem - implementing a vector correctly.

Let's also allow a BASIC POD to have its last member as a zero-sized array.
Such an array may be validly accessed provided there is storage behind it -
because this allows us to create really useful things like buffers that the
average programmer can understand.

Further, let's legalise pointer arithmetic.

Finally, let's stop trying to pretend that memory is some nebulous thing.
It's memory. Sometimes C++ needs to go low level, and sometimes it's useful
for it to be high level. Let's keep the versatility. gcc's optimiser copes
with that, so does clang's. There is no need to start writing doublespeak in
the standard about it not being true. It is true.

Post by Nicol Bolas
What I'm saying is that *casting* should not be something that is
encouraged.

Casting cannot be avoided when you interface with C libraries. Every
production C++ program interfaces with C (and sometimes Objective-C)
libraries. Therefore, casting cannot be avoided. Interacting with C's
"BASIC PODS" cannot be avoided. Therefore it must not be undefined. If
nothing else, this will prevent every 20th post on Stack Overflow from being
howls of outrage at being forced to write memcpy, only for the copy to be
thrown away.

Let me put this another way:

this code:

std::memcpy(&myints, your_chars, sizeof(int) * 10);

currently signals to the compiler that your_chars are really an array of 10
ints.

so should this:

struct F {
    int n;
    int a[];
};

extern "C" {
F* makeF();
}

auto pf = makeF();

// pf->n should be valid AND pf->a[pf->n-1] should be valid when (pf->n > 0)

and this:

int* pint = (int*)your_chars;

in which case *(pint + 6) should be mandated as valid *if* there is storage
behind the address.

If you want to treat c++ as a high level language, no problem. Someone else
can write the wrapper for you. But the wrapper should be able to be
*standard-compliant within the C++ language*.


Nicol Bolas

2017-01-18 19:43:52 UTC

Post by Nicol Bolas
You keep saying that as though it were some objective fact rather than
a choice.
The fact that I can cast properly aligned storage to a POD and use it as a
POD:
a) is a de-facto reality and always will be

Sure. But that doesn't mean we should *encourage* it.

b) is necessary to allow c++ to interact with every computer system in the
world.

No, it is not necessary at all. We can develop C++ tools that don't involve
casting which can achieve the same effect.

The mechanism isn't important; the *effect* is what matters. I don't see
any problems with C++ using different mechanisms to achieve the effects you
would use in C. As long as we can get equivalent functionality, there's no
need to adopt C-isms.

c) should therefore obviously be mandated as true in the standard.

Post by Nicol Bolas
The concept of "object" only exists at the level of the standard. And

OK, let's say we do that.

Are you prepared to rewrite the *entire standard* to that effect? Before,
the standard could use the term "object"; when it did, that part of the
standard would apply to a type as simple as `int` or one as complex as
`complex_polymorphic_cpp_type`. `new int` returns a newly allocated `int`
object. `new complex_polymorphic_cpp_type` would likewise return a newly
allocated object of that type. So how do you define what it does now?

Now, you have to come up with some new term to represent both cases,
because there are lots of parts of the standard that cover both. You then
have to change it everywhere, except for those cases where you need to
differentiate them.

And that's just *one aspect* of your suggested change.

It's easy to tell someone else to do the hard work, isn't it? It's easy to
say "take this building down and build a new one." It's a lot harder when
it's just you with some concrete and a sledgehammer who has to do the
actual work.

If it's so easy, then do it. Put together a proposal. Not a couple of
paragraphs, but an *actual, firm, real proposal* (one that perhaps doesn't
use ALL CAPS as much as you do here. Seriously, why does BASIC need to be
capitalized?). One with at least a good-faith attempt at standards wording.
One that results in a coherent object model.

Now, if we choose to overlay a BASIC POD half-way into some other object,
then OBVIOUSLY, access to that other object is undefined. But the BASIC POD
is not.
Why?
a) Because this is reality and,
b) It's necessary and,
c) it solves your pet problem - implementing a vector correctly.

It doesn't solve my "pet problem". Your change only affects non-"object"s;
`vector<T>` needs to work whether `T` is an "object" type or a "BASIC POD"
type.

Let's also allow a BASIC POD to have its last member as a zero-sized array.
Such an array may be validly accessed provided there is storage behind it -
because this allows us to create really useful things like buffers that the
average programmer can understand.
Further, let's legalise pointer arithmetic.

Pointer arithmetic is legal within arrays, which is fine. The problem is
with things that are array-like but aren't technically arrays.

Suggesting that we fundamentally re-architect the entire object model just
to make `vector` work is like using a nuke on a mountain, then *rebuilding*
that mountain with a tunnel in it. It'd be a lot easier and less
radioactive to just drill a tunnel.

Finally, let's stop trying to pretend that memory is some nebulous thing.
It's memory. Sometimes C++ needs to go low level, and sometimes it's useful
for it to be high level. Let's keep the versatility. gcc's optimiser copes
with that, so does clang's. There is no need to start writing doublespeak in
the standard about it not being true. It is true.

Post by Nicol Bolas
What I'm saying is that *casting* should not be something that is

Does it? Would `memcpy(&myints, your_chars, 4 * 10)` provide the same
signal? How does the compiler know that it's an array of 10 ints and not of
10 floats? Or 20 shorts? Or 40 chars? Or whatever?

I know what would *actually* signal copying 10 ints:

std::trivial_copy_assign_strict<int>(&myints, your_chars, 10);
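
`std::trivial_copy_assign_strict` is likewise a suggestion, not an existing library function, and what "strict" would mean is not spelled out here. Purely as an illustration, the sketch below assumes it means "element type and element count are explicit at the call site" and implements that over `std::memcpy`. The `sketch` namespace and `sum_ten_ints_from_bytes` are hypothetical, and the destination is passed as `myints` (decaying to `int*`) rather than `&myints`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for the suggested `trivial_copy_assign_strict<T>`.
// Overwrites `count` existing T objects starting at `dest` with bytes read
// from `src`. The caller must guarantee `src` holds count * sizeof(T) bytes.
template <typename T>
void trivial_copy_assign_strict(T* dest, const void* src, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    std::memcpy(dest, src, count * sizeof(T));
}

}  // namespace sketch

// Usage: fill an array of 10 ints from a byte buffer, then sum the elements.
inline int sum_ten_ints_from_bytes(const unsigned char* your_chars) {
    int myints[10];
    sketch::trivial_copy_assign_strict<int>(myints, your_chars, 10);
    int total = 0;
    for (int v : myints) {
        total += v;
    }
    return total;
}
```

Compared with `memcpy(&myints, your_chars, 4 * 10)`, the element type appears in the template argument and the count is in elements, so the reader does not have to infer either from a byte total.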

struct F {
    int n;
    int a[];
};

That's not legal C++.



Richard Hodges

2017-01-18 19:58:53 UTC

Post by Nicol Bolas
Sure. But that doesn't mean we should *encourage* it.

enablement is not encouragement.

Post by Nicol Bolas
No, it is not necessary at all. We can develop C++ tools that don't
involve casting which can achieve the same effect.

Not standards-compliant ones, you can't. And no-one has. Otherwise, please
show me one. In the face of reality, you're quoting me hypotheses that will
never see the light of day. You're wasting our time.

Post by Nicol Bolas
OK, let's say we do that. Are you prepared to rewrite the *entire
standard* to that effect?

Good. Yes, absolutely. The current standard is garbage, because it tells
lies. Like an interface that lies, that is worse than useless. Fix it.

Post by Nicol Bolas
If it's so easy, then do it. Put together a proposal. Not a couple of
paragraphs, but an *actual, firm, real proposal* (one that perhaps doesn't
use ALL CAPS as much as you do here. Seriously, why does BASIC need to be
capitalized?). One with at least a good-faith attempt at standards wording.
One that results in a coherent object model.

I pretty much just did. But sure. If you can guarantee that it will get in
front of the top guy, and I'll get a chance to argue common sense and the
position of the majority of the user base in person, no problem. Like any
organisation, speaking to anyone other than the decision maker wastes
everyone's time. Who is the decision-maker here, or the person to whose
opinion everyone on the committee will submit? Show me that and I will
teach you how it's done.

Post by Nicol Bolas
It doesn't solve my "pet problem". Your change only affects
non-"object"s; `vector<T>` needs to work whether `T` is an "object" type or
a "BASIC POD" type.

If vector<T> can't be implemented in C++, it's because you've broken C++.
Let's repair the damage before you guys make it worse.

Post by Nicol Bolas
Pointer arithmetic is legal within arrays, which is fine. The problem is
with things that are array-like but aren't technically arrays.

Pointers are index registers. End of story. If you add the size of an
object to one, you get the address of the adjacent object. A 3-year-old can
understand this. <sigh>

Post by Nicol Bolas
Suggesting that we fundamentally re-architect the entire object model
just to make `vector` work is like using a nuke on a mountain, then
*rebuilding* that mountain with a tunnel in it. It'd be a lot easier and
less radioactive to just drill a tunnel.

It's not just vector. The entire edifice upon which c++ depends
(interaction with C) is broken. I say we stop whinging and fix it.

Post by Nicol Bolas
std::trivial_copy_assign_strict<int>

Sure, let's type 40 cryptic characters where none would do.

Post by Nicol Bolas
That's not legal C++.

And yet it works, is expressive and necessary if you want to describe
variable length buffers, which are incredibly useful, without resorting to
casts.

I thought you were against casts?

Stop arguing the unarguable. The standard is broken. Let's fix it.