[std-proposals] One variable init form to rule them all, via mandatory elision

Discussion:

Mark A. Gibbs

2015-09-09 23:27:17 UTC

Scott Meyers's recent post on initialization has brought up a long-time pet
peeve of mine: defining new objects. My beef is not that there are multiple
ways to do it, it's that there is no single way that Just Works(tm)
everywhere, and does the right thing, with robust syntax. The "Almost
Always auto" style proposed by Herb Sutter comes close, but there are some
gotchas with it. It seems to me there is a very simple way to remove those
gotchas, but I've never seen it proposed.

I'm going to start by explaining why I've chosen AAA as the basis for this
proposal. If you're already down with AAA, you can skip to the tl;dr at the
bottom.

Meyers enumerates 4 different ways to declare and initialize a new
variable. With "type" being either a type name or "auto", and "val" being
an lvalue of type "type", they are:
type v = val; // #1
type v(val); // #2
type v = {val}; // #3
type v{val}; // #4

Let's just write #2 off immediately, because of the most vexing arse -
pardon the typo.
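
For anyone who hasn't been bitten by it yet, here is a minimal sketch of the parse in question. The names "widget" and "vexing_parse_demo" are made up for illustration; the static_asserts only confirm what the compiler thinks each name is.

```cpp
#include <type_traits>

struct widget {};  // hypothetical type, for illustration only

void vexing_parse_demo()
{
    // Looks like form #2 with no arguments, but this declares a function
    // named w that takes no parameters and returns a widget.
    widget w();
    static_assert(std::is_function<decltype(w)>::value,
                  "w is a function declaration, not an object");

    // The "auto v = type(...)" spelling cannot be read as a declaration.
    auto v = widget();
    static_assert(std::is_same<decltype(v), widget>::value,
                  "v is an object of type widget");
    (void)v;  // silence unused-variable warnings
}
```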

#4 is deeply problematic, because the behaviour has changed between C++14
and C++17. So let's drop that, too.

#3 behaves differently depending on whether "type" is an actual type name
or "auto". So let's put that aside for the moment.
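
A short sketch of that difference, assuming nothing beyond the standard library (the function name "form3_demo" is just a wrapper for the example):

```cpp
#include <initializer_list>
#include <type_traits>

void form3_demo()
{
    int a = {0};   // #3 with a type name: a is an int
    auto b = {0};  // #3 with auto: b is a std::initializer_list<int>

    static_assert(std::is_same<decltype(a), int>::value,
                  "a is an int");
    static_assert(std::is_same<decltype(b), std::initializer_list<int>>::value,
                  "b is an initializer_list<int>");
    (void)a;
    (void)b;
}
```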

That leaves us with:
type v = val;

Which, using int, can be either:
int v = 0; // (a)
auto v = 0; // (b)

(a) is fine, but somewhat redundant and occasionally dangerous - if the
type isn't the same on the left and right, you may end up with a conversion
that may be expensive, or may lose information.
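
As a rough illustration of the "may lose information" case, here is a sketch using a double initializer. The function names are hypothetical, and the functions are constexpr (C++14) only so the results can be checked at compile time.

```cpp
#include <type_traits>

// Form (a): the declared type wins, so the initializer is silently converted.
constexpr int declared_as_int()
{
    int v = 3.9;  // double -> int: compiles, the fractional part is discarded
    return v;
}

// Form (b): the variable takes the initializer's type, so nothing is converted.
constexpr double declared_with_auto()
{
    auto v = 3.9;
    static_assert(std::is_same<decltype(v), double>::value, "v is a double");
    return v;
}

static_assert(declared_as_int() == 3, "the .9 is gone");
static_assert(declared_with_auto() == 3.9, "the value is preserved");
```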

So that brings us down to (b), and the whole train of logic above turns out
to be a reiteration of the "Almost Always auto" (AAA) argument. For those who
aren't familiar, the AAA argument is that variables should always be
defined using the pattern:
auto v = /* initial value */;

The benefits of this include:

- No unexpected conversions.
- Impossible to have an uninitialized variable.
- Consistent with other modern constructs, easy to read, hard to
misunderstand.

Generally speaking, if you already have a value and you just want to
initialize a copy or move it, you can just use:
auto v = val;

But if you don't have a value, or if you do but you want to be explicit
about the type, you can use:
auto v = type{val}; // or any other constructor arguments, including none

In cases where you *REALLY* want that type, and you don't care if you're
narrowing - or when you *really* want to use a specific constructor, and
the arguments are not just an initializer list of values - then you can use:
auto v = type(val); // Most vexing parse not a problem
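
To make the brace/parenthesis distinction concrete, here is a small sketch with hypothetical function names. Braces reject a narrowing initializer outright, while parentheses perform the conversion you asked for.

```cpp
#include <type_traits>

// Braces: the type is spelled out, and narrowing is rejected.
constexpr auto explicit_type()
{
    auto v = long{42};  // int constant -> long, not narrowing, so accepted
    return v;
}

// Parentheses: an explicit conversion, narrowing included.
constexpr int with_parens()
{
    auto v = int(3.9);  // truncates to 3
    return v;
}

// auto w = int{3.9};  // ill-formed: narrowing conversion inside braces

static_assert(std::is_same<decltype(explicit_type()), long>::value,
              "the variable has the type named on the right");
static_assert(with_parens() == 3, "the fractional part was dropped");
```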

The C++ standard allows compilers to elide the copy construction and move
assignment (really a move construction) to a single construction... *even
though it may produce different observable behaviour*. It is (AFAIK) the
only optimization the standard allows that can do that. That means that:
auto v = val; // a single copy construction
auto v = type{val}; // can be a single copy construction
auto v = type(val); // can be a single copy construction

And all modern compilers worth mentioning do the elision.

This also means that:
auto v = type{}; // can be just default construction
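
One way to watch the elision happen is to count constructions. The sketch below uses a hypothetical "tracer" type that records into a "tally"; it is constexpr (C++14) only so the counts can be checked at compile time. An eliding compiler reports zero copies or moves; one that declined to elide would report one. Either way there is exactly one ordinary construction.

```cpp
// Hypothetical bookkeeping type: how many objects were built, and how.
struct tally
{
    int constructions = 0;
    int copies_or_moves = 0;
};

// Hypothetical instrumented type that reports into a tally.
struct tracer
{
    tally* t;

    constexpr explicit tracer(tally& out) : t(&out) { ++t->constructions; }
    constexpr tracer(tracer const& other) : t(other.t) { ++t->copies_or_moves; }
};

// Runs "auto v = type{args};" once and returns what was counted.
constexpr tally run_aaa()
{
    tally result{};
    auto v = tracer{result};
    (void)v;  // silence unused-variable warnings
    return result;
}

static_assert(run_aaa().constructions == 1, "one ordinary construction");
static_assert(run_aaa().copies_or_moves <= 1, "zero if elided, one if not");
```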

So it seems we have a winner for the one, true variable definition format.

But not so fast....

As Meyers noted, this pattern has one case where it doesn't work: types
that are non-copyable and non-movable.

The reasoning is that even though the standard *allows* the move (or copy)
construction to be elided, it doesn't *require* it. And compilers have to
pretend they're still going to do it, even though they're not. That is why
this won't compile:
auto v = std::atomic<int>{};

But this restriction is silly and pedantic. There is no way the line above
can reasonably be interpreted as anything else but "I want v to be a
default constructed std::atomic<int>".
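
The check the compiler is making can be spelled out with type traits. This is only a sketch of the C++14 rules discussed here; compilers in C++17 mode or later accept the "auto" line, so the example tests the traits instead of relying on a compile error. The function name "direct_form" is made up.

```cpp
#include <atomic>
#include <type_traits>

// std::atomic<int> can be default constructed...
static_assert(std::is_default_constructible<std::atomic<int>>::value, "");

// ...but it can be neither copied nor moved, and the C++14 rules require an
// accessible copy or move constructor for "auto v = std::atomic<int>{};"
// even though the copy or move itself would be elided.
static_assert(!std::is_copy_constructible<std::atomic<int>>::value, "");
static_assert(!std::is_move_constructible<std::atomic<int>>::value, "");

void direct_form()
{
    std::atomic<int> v{};  // the direct form involves no copy or move at all
    (void)v;
}
```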

And if anyone wants to try to argue that someone might actually want to do
a default construction then move construction when writing that, I call
shenanigans. Because that's not what they're going to get - in any compiler
of note - and the standard blesses this.

Put yourself in the shoes of a C++ newbie who's just learning the language.
Wanting to explore constructors and destructors, ze writes this simple
class:

#include <iostream>

struct foo
{
    foo() { std::cout << "constructing foo\n"; }
    ~foo() { std::cout << "destructing foo\n"; }

    foo(foo const&) = delete;
    foo& operator=(foo const&) = delete;

    foo(foo&&) = delete;
    foo& operator=(foo&&) = delete;
};

int main()
{
    foo f;
}

// Output:
// constructing foo
// destructing foo

So far, no problems. But then ze decides to try using the AAA pattern to
define "f":

int main()
{
    auto f = foo{};
}

This fails to compile, and the compiler explains that it's because "foo" is
non-movable and non-copyable. "Ah!" says our newbie, thinking ze's learned
something useful. "Well this makes perfect sense! I suppose I'm doing a
construction on the right, and then an assignment to the variable on the
left. Well played C++. Well, now let me experiment to see which special
member function gets used to do the transfer from the right side to the
left."

So now ze writes this:

struct foo
{
    foo() { std::cout << "constructing foo\n"; }
    ~foo() { std::cout << "destructing foo\n"; }

    foo(foo const&) { std::cout << "copy constructing foo\n"; }
    foo& operator=(foo const&) { std::cout << "copy assigning foo\n"; return *this; }

    foo(foo&&) { std::cout << "move constructing foo\n"; }
    foo& operator=(foo&&) { std::cout << "move assigning foo\n"; return *this; }
};

int main()
{
    auto f = foo{};
}

And how is our diligent newbie rewarded for zes experimentation? With this:

// Output:
// constructing foo
// destructing foo

"What... what the flagnar? None of the special functions are used? How can
this be? The compiler said I needed to add them! Why would the compiler
insist on having them if it never uses them! Is this a compiler bug? No?
This absurdity is standard behaviour? To hell with C++. Bring me a Ruby
book."

The cause of all this confusion is the requirement that statements of the
form "type v = type{/*params*/};" or "auto v = type{/*params*/};" (or with
parentheses rather than braces in both cases) *MUST* be interpreted as a
construction on the right, then *another* (move/copy) construction to the
variable on the left (and then a destruction of the temporary)... even
though at the same time compilers are free to elide it all down to a single
construction.

What if, instead, statements of the form "type v = type{/*params*/};" or "auto
v = type{/*params*/};" were interpreted as a single construction, of the
variable "v" which has type "type"?

This would apply only to statements of the form "XXX varname =
YYY{/*arguments*/};" or "XXX varname = YYY(/*arguments*/);", provided that
"XXX" and "YYY" deduce to the same type (which is obviously true if "XXX" is
"auto"). It would not apply to, for example, "float v = int{42};". It *would*
apply to "using number = int; number v = int{42};". It would be exactly
equivalent to "YYY varname{/*arguments*/};" or "YYY varname(/*arguments*/);"
(except, in the latter case, it will not be subject to the most vexing
parse when that applies).
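
The "same type" condition can be sketched as a compile-time predicate. Everything named here ("rule_applies", "auto_placeholder") is hypothetical, and treating "same type" as an exact std::is_same match is an assumption; how cv-qualifiers on the left should be handled is not something this sketch settles.

```cpp
#include <type_traits>

struct auto_placeholder {};  // hypothetical stand-in for "auto" on the left

// True when "Declared v = Initializer{...};" would fall under the rule.
template <class Declared, class Initializer>
constexpr bool rule_applies =
    std::is_same<Declared, auto_placeholder>::value ||
    std::is_same<Declared, Initializer>::value;

using number = int;

static_assert(rule_applies<auto_placeholder, int>, "auto v = int{42};");
static_assert(rule_applies<number, int>, "number v = int{42};");
static_assert(!rule_applies<float, int>, "float v = int{42};");
```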

For backwards compatibility, when "type" is movable or copyable, compilers
still have the leeway to implement it as a construction then a move
construction. I don't know of any compilers that actually do that, but,
just in case. That means that the meaning of any code that compiles today
will not change. The only thing that will change is that code that doesn't
compile today - when "type" is non-movable and non-copyable - will
now compile (and obviously will only be interpreted as a single
construction, because it can't possibly be interpreted as a construction
then move/copy). I can't imagine this adds any complexity to compilers, and
in fact it *removes* a completely unnecessary check that they do today (the
check for move/copy constructability).

The result is we would get a way to define initialized variables that works
consistently in every case, is fairly robust against mistakes and typos,
and has obvious and logical semantics:

int i = 0; // an int
int i = int{0}; // identical to first line (does narrowing check)
int i = int(0); // identical to first line (no narrowing check)
auto i = 0; // identical to first line
auto i = int{0}; // identical to first line (does narrowing check)
auto i = int(0); // identical to first line (no narrowing check)

int i = int{}; // default construction
auto i = int{}; // identical to above

// Same semantics with types with UDLs
std::string s{"foo"}; // std::string
auto s = std::string{"foo"}; // identical to first line
auto s = "foo"s; // using a UDL, (effectively) identical

std::string s = std::string{}; // default construction
auto s = std::string{}; // identical to above

// Non-copyable, non-movable types, too
std::atomic<int> i = 0; // an std::atomic<int>
std::atomic<int> i = std::atomic<int>{0}; // identical to first line
std::atomic<int> i = std::atomic<int>(0); // identical to first line
auto i = std::atomic<int>{0}; // identical to first line
auto i = std::atomic<int>(0); // identical to first line

std::atomic<int> i = std::atomic<int>{}; // default construction
auto i = std::atomic<int>{}; // identical to above

auto a = std::atomic<int>{42};
std::atomic<int> b = a; // won't compile (non-copyable)
auto b = a; // same
std::atomic<int> b = std::move(a); // won't compile (non-movable)
auto b = std::move(a); // same

auto func() -> std::atomic<int>;
std::atomic<int> b = func(); // won't compile (non-movable)
auto b = func(); // same
std::atomic<int> b = std::atomic<int>{func()}; // Won't compile, because the
                                               // construction on the right
                                               // is invalid. Would work if
                                               // func() returned an int.
auto b = std::atomic<int>{func()}; // same

// Familiar behaviours (that already assume elision) are unchanged
std::vector<int> v = std::vector<int>{}; // default construction
auto v = std::vector<int>{}; // same

std::vector<int> v = std::vector<int>{2, 3}; // list construction { 2, 3 }
auto v = std::vector<int>{2, 3}; // same

std::vector<int> v = std::vector<int>(2, 3); // constructs vector { 3, 3 }
auto v = std::vector<int>(2, 3); // same

// If you stick with the policy to always use auto on the left,
// even the initializer list issue that gnaws on Meyers becomes clear:
auto i = 0; // i is int (what else would it be?)
auto i = {0}; // i is initializer_list<int> (what else would it be?)
auto i = int{0}; // i is int (what else would it be?)

// It only gets weird if you do:
int i = 0; // Clear
int i = {0}; // What? You want to assign an initializer_list<int> to an int?
             // Well, I guess it works if there's only one int in the list...
int i = int{0}; // Clear, but redundant
// Plus all the headaches mentioned in N4014
// But that's a style issue.
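
The std::vector lines in the listing above are the ones people most often mix up, so here is a runnable sketch of just those two. The function names are made up for the example.

```cpp
#include <cassert>
#include <vector>

// Braces select the initializer_list constructor: the elements are 2 and 3.
std::vector<int> with_braces()
{
    auto v = std::vector<int>{2, 3};
    return v;
}

// Parentheses select the (count, value) constructor: two elements, both 3.
std::vector<int> with_parens()
{
    auto v = std::vector<int>(2, 3);
    return v;
}

void vector_forms_demo()
{
    auto const b = with_braces();
    auto const p = with_parens();
    assert(b.size() == 2 && b[0] == 2 && b[1] == 3);
    assert(p.size() == 2 && p[0] == 3 && p[1] == 3);
}
```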

So here's the tl;dr bullet point version:

- The AAA style is awesome.
- It has only one gotcha - it fails for non-copyable, non-movable types.
- The language already allows the move/copy to be elided for
copyable/moveable types, and all compilers do that. So the move/copy
requirement is unnecessary. Let's ditch it.
- Let all statements of the form "XXX v = YYY{/*args*/};" where "XXX"
resolves to the same type as "YYY" (which includes when "XXX" is "auto",
obviously), be exactly equivalent to "YYY v{/*args*/};".
- Let all statements of the form "XXX v = YYY(/*args*/);" where "XXX"
resolves to the same type as "YYY" (which includes when "XXX" is "auto",
obviously), be exactly equivalent to "YYY v(/*args*/);"... ignoring the
most vexing parse.
- For the sake of backward compatibility, give compilers the freedom -
when the type is copyable and/or movable - to interpret "XXX v =
YYY{/*args*/};" or "XXX v = YYY(/*args*/);" as a construction followed
by a move/copy construction. (None do, and none likely will, but just in
case. This behaviour could be deprecated, I suppose.)
- In other words, the currently standard-blessed optimization of eliding
the unnecessary temporary construction and move would become not an
optimization, but "the way it's done". Not eliding would become a
"tolerated pessimization".
- Won't change the meaning of any old code. Will make code that
currently won't compile - for silly, pedantic reasons - functional.


Ville Voutilainen

2015-09-09 23:34:15 UTC