Why are literals and temporary variables not lvalues?
There are a lot of common misconceptions in the question and in the other answers; this answer tries to address them.
The terms lvalue and rvalue are expression categories. They are terms that apply to expressions, not to objects. (A bit confusingly, the official term for expression categories is "value categories"!)
The term temporary object refers to objects. This includes objects of class type, as well as objects of built-in type. The term temporary (used as a noun) is short for temporary object. Sometimes the standalone term value is used to refer to a temporary object of built-in type. These terms apply to objects, not to expressions.
The C++17 standard is more consistent in object terminology than past standards, e.g. see [conv.rval]/1. It now tries to avoid saying value other than in the phrase "value of an expression".
Now, why are there different expression categories? A C++ program is made up of a collection of expressions, joined to each other with operators to make larger expressions; and fitting within a framework of declarative constructs. These expressions create, destroy, and do other manipulations on objects. Programming in C++ could be described as using expressions to perform operations with objects.
The reason that expression categories exist is to provide a framework for using expressions to express operations that the programmer intends. For example, way back in the C days (and probably earlier), the language designers figured that 3 = 5; did not make any sense as part of a program, so it was decided to limit what sort of expression can appear on the left-hand side of =, and have the compiler report an error if this restriction wasn't followed.
The term lvalue originated in those days, although now with the development of C++ there is a vast range of expressions and contexts where expression categories are useful, not just the left-hand side of an assignment operator.
Here is some valid C++ code: std::string("3") = std::string("5");. This is conceptually no different from 3 = 5;, however it is allowed. The effect is that a temporary object of type std::string and content "3" is created, then that temporary object is modified to have content "5", and then the temporary object is destroyed. The language could have been designed so that the code 3 = 5; specifies a similar series of events (but it wasn't).
Why is the string example legal but the int example not?
Every expression has to have a category. The category of an expression might not seem to have an obvious reason at first, but the designers of the language have given each expression a category according to what they think is a useful concept to express and what isn't.
It's been decided that the sequence of events in 3 = 5; as described above is not something anyone would want to do, and if someone did write such a thing then they probably made a mistake and meant something else, so the compiler should help out by giving an error message.
Now, the same logic might conclude that std::string("3") = std::string("5") is not something anyone would ever want to do either. However, another argument is that for some other class type, T(foo) = x; might actually be a worthwhile operation, e.g. because T might have a destructor that does something. It was decided that banning this usage could do more harm than good to a programmer's intentions. (Whether that was a good decision or not is debatable; see this question for discussion.)
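To make the "destructor that does something" argument concrete, here is a minimal sketch. The type Commit, its members and the function assign_to_temporary are hypothetical names invented for this illustration; the only point is that assigning to a class-type temporary can have an observable effect once the temporary is destroyed.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical type: writes its final value to a sink when it is destroyed.
struct Commit {
    std::string value;
    std::string* sink;  // where the destructor stores the value

    Commit(std::string initial, std::string* out)
        : value(std::move(initial)), sink(out) {}

    Commit& operator=(const std::string& v) {
        value = v;
        return *this;
    }

    ~Commit() { *sink = value; }
};

// Assigns to a temporary Commit. The temporary is destroyed at the end of
// the full-expression, and its destructor makes the assignment observable.
inline std::string assign_to_temporary() {
    std::string log;
    Commit("3", &log) = "5";  // legal: assignment to a class-type temporary
    assert(log == "5");       // the destructor has already run here
    return log;
}
```

With int there is no destructor to observe, which is one way to see why 3 = 5; was judged to be pointless while the class-type form was left legal.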
Now we are getting closer to finally addressing your question :)
Whether or not there is memory or a storage location associated is not the rationale for expression categories any more. In the abstract machine (more explanation of this below), every temporary object (this includes the one created by 3 in x = 3;) exists in memory.
As described earlier in my answer, a program consists of expressions that manipulate objects. Each expression is said to designate or refer to an object.
It's very common for other answers or articles on this topic to make the incorrect claim that an rvalue can only designate a temporary object, or even worse, that an rvalue is a temporary object, or that a temporary object is an rvalue. An expression is not an object; it is something that occurs in source code for manipulating objects!
In fact a temporary object can be designated by an lvalue or an rvalue expression; and a non-temporary object can be designated by an lvalue or an rvalue expression. They are separate concepts.
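A small sketch of that independence, checked at compile time. The macros IS_LVALUE and IS_RVALUE and the function categories_demo are helper names made up for this example; they rely on the fact that decltype((e)) is an lvalue reference type exactly when the expression e is an lvalue.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helpers: classify the *expression* e, not the object it designates.
#define IS_LVALUE(e) (std::is_lvalue_reference<decltype((e))>::value)
#define IS_RVALUE(e) (!std::is_lvalue_reference<decltype((e))>::value)

inline void categories_demo() {
    const int& r = 5;  // r is bound to a (lifetime-extended) temporary object
    int x = 7;         // x names a non-temporary object

    static_assert(IS_LVALUE(r), "lvalue expression, temporary object");
    static_assert(IS_RVALUE(5), "rvalue expression");
    static_assert(IS_LVALUE(x), "lvalue expression, non-temporary object");
    static_assert(IS_RVALUE(std::move(x)), "rvalue expression, non-temporary object");

    (void)r;
    (void)x;
}
```

All four combinations of expression category and kind of object appear here, which is the sense in which the two concepts are separate.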
Now, there's an expression category rule that you can't apply & to an expression of the rvalue category. The purpose of this rule and these categories is to avoid errors where a temporary object is used after it is destroyed. For example:
int *p = &5; // not allowed due to category rules
*p = 6; // oops, dangling pointer
But you could get around this:
template<typename T> auto f(T&&t) -> T& { return t; }
// ...
int *p = &f(5); // Allowed
*p = 6; // Oops, dangling pointer, no compiler error message.
In this latter code, f(5) and *p are both lvalues that designate a temporary object. This is a good example of why the expression category rules exist: if we follow the rules without a tricky workaround, we get an error for the code that tries to write through a dangling pointer.
Note that you can also use this f to find the memory address of a temporary object, e.g. std::cout << &f(5);
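Here is a self-contained sketch of the same trick used safely, i.e. only inside the full-expression that creates the temporary. The names as_lvalue, modify_temporary and print_temporary_address are hypothetical. The explicit static_cast is my own addition: as far as I know, C++23's implicit-move rules reject a plain return t; in a function shaped like f above, and the cast keeps the helper valid there too.

```cpp
#include <cassert>
#include <iostream>

// Hypothetical helper, equivalent in spirit to f above: turns any argument
// into an lvalue that designates the same object.
template <typename T>
T& as_lvalue(T&& t) {
    return static_cast<T&>(t);
}

// Safe use: the temporary int is modified and read before the
// full-expression ends, so nothing dangles.
inline int modify_temporary() {
    int result = (as_lvalue(5) += 1);
    assert(result == 6);
    return result;
}

// Prints the address of a temporary object; the value printed is
// implementation-specific.
inline void print_temporary_address() {
    std::cout << static_cast<const void*>(&as_lvalue(5)) << '\n';
}
```

Storing &as_lvalue(5) in a pointer that outlives the statement would reproduce the dangling-pointer problem shown earlier.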
In summary, the questions you actually ask all mistakenly conflate expressions with objects. So they are non-questions in that sense. Temporaries are not lvalues, because objects are not expressions.
A valid but related question would be: "Why is the expression that creates a temporary object an rvalue (as opposed to being an lvalue?)"
To which the answer is as was discussed above: having it be an lvalue would increase the risk of creating dangling pointers or dangling references; and, as in 3 = 5;, would increase the risk of specifying redundant operations that the programmer probably didn't intend.
I repeat that the expression categories are a design decision to help with programmer expressiveness; they have nothing to do with memory or storage locations.
Finally, to the abstract machine and the as-if rule. C++ is defined in terms of an abstract machine, in which temporary objects have storage and addresses too. I gave an example earlier of how to print the address of a temporary object.
The as-if rule says that the output of the actual executable the compiler produces need only match the output that the abstract machine would produce. The executable doesn't actually have to work in the same way as the abstract machine; it just has to produce the same result.
So for code like x = 5;, even though a temporary object of value 5 has a memory location in the abstract machine, the compiler doesn't have to allocate physical storage on the real machine. It only has to ensure that x ends up having 5 stored in it, and there are much easier ways to do this that don't involve extra storage being created.
The as-if rule applies to everything in the program, even though my example here only refers to temporary objects. A non-temporary object could equally well be optimized out, e.g. int x; int y = 5; x = y; (followed by other code that doesn't use y) could be changed to int x = 5;.
The same applies for class types without side-effects that would alter the program output. E.g. std::string x = "foo"; std::cout << x; can be optimized to std::cout << "foo"; even though the lvalue x denoted an object with storage in the abstract machine.
Why are literals and temporary variables not lvalues?
I have two answers: because it wouldn't make sense (1) and because the Standard says so (2). Let's focus on (1).
Is it because literals and temporaries variables do not have defined storage location?
This is a simplification that doesn't fit here. A simplification that would: literals and temporaries are not lvalues because it wouldn't make sense to modify them [1].
What is the meaning of 5++? What is the meaning of rand() = 0? The Standard says that temporaries and literals are not lvalues so those examples are invalid. And every compiler developer is happier.
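You can ask the compiler directly whether such assignments are well-formed. This sketch uses std::is_assignable from <type_traits>, where a non-reference first argument stands for an rvalue expression of that type and T& stands for an lvalue. The std::string line is my own addition, included for contrast with class types.

```cpp
#include <string>
#include <type_traits>

// std::is_assignable<T, U> asks whether `std::declval<T>() = std::declval<U>()`
// is well-formed.
static_assert(!std::is_assignable<int, int>::value,
              "an rvalue int (like 5 or rand()) cannot be assigned to");
static_assert(std::is_assignable<int&, int>::value,
              "an lvalue int can be assigned to");
static_assert(std::is_assignable<std::string, std::string>::value,
              "an rvalue of class type can be assigned to");
```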
[1] You can define and use user-defined types in a way where the modification of a temporary makes sense. This temporary would live until the end of the evaluation of the full-expression. François Andrieux makes a nice analogy between calling f(MyType{}.mutate()) on one hand and f(my_int + 1) on the other. I think the simplification still holds, since MyType{}.mutate() can be seen as another temporary, just as MyType{} was, in the same way that my_int + 1 can be seen as another int, just as my_int was. This is all semantics and opinion-based. The real answer is: (2) because the Standard says so.
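A minimal sketch of that analogy. Only the names MyType, mutate, f and my_int come from the text; the member state, the bodies and the two wrapper functions are assumptions made up for illustration.

```cpp
#include <cassert>

// Hypothetical MyType whose mutate() changes the object and returns it.
struct MyType {
    int state = 0;
    MyType& mutate() {
        ++state;
        return *this;
    }
};

inline int f(const MyType& m) { return m.state; }
inline int f(int i) { return i; }

// The temporary MyType lives until the end of the full-expression,
// so f sees the mutated state.
inline int call_with_mutated_temporary() {
    int result = f(MyType{}.mutate());
    assert(result == 1);
    return result;
}

// The int counterpart: my_int + 1 yields a new value and leaves my_int alone.
inline int call_with_sum(int my_int) {
    return f(my_int + 1);
}
```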
And also that literals and temporaries variables are not lvalues, but no reason is given for this statement.
This is true for all temporaries and literals except for string literals. Those are actually lvalues (which is explained below).
Is it because literals and temporaries variables do not have defined storage location? If yes, then where do they reside if not in memory?
Yes. The literal 2 doesn't actually exist; it is just a value in the source code. Since it's a value, not an object, it doesn't have to have any memory associated with it. It can be hard coded into the assembly that the compiler creates, or it could be put somewhere, but since it doesn't have to be, all you can do is treat it as a pure value, not an object.
There is an exception though, and that is string literals. Those actually have storage, since a string literal is an array of type const char[N]. You can take the address of a string literal and a string literal can decay into a pointer, so it is an lvalue, even though it doesn't have a name.
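A short sketch that checks both claims. The function name check_literal_storage is hypothetical; the static_asserts use decltype((e)), which is a reference type only when e is an lvalue or xvalue.

```cpp
#include <cassert>
#include <type_traits>

static_assert(std::is_same<decltype(("abc")), const char (&)[4]>::value,
              "a string literal is an lvalue of type const char[4]");
static_assert(!std::is_reference<decltype((2))>::value,
              "the literal 2 is a prvalue");

inline void check_literal_storage() {
    const char (*whole)[4] = &"abc";  // address of the array object itself
    const char* first = "abc";        // array-to-pointer decay
    assert(whole != nullptr);
    assert(first[0] == 'a');
    // int* q = &2;                   // error: cannot take the address of an rvalue
}
```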
Temporaries are also rvalues. Even if they exist as objects, their storage location is ephemeral. They only last until the end of the full expression they are in. You are not allowed to take their address and they also do not have a name. They might not even exist: for instance, in
Foo a = Foo();
The Foo() can be removed and the code semantically transformed to
Foo a(); // you can't actually do this since it declares a function with that signature.
so now there isn't even a temporary object in the optimized code.
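One way to observe this is to count copies and moves. This Foo, its counter and the function name are hypothetical stand-ins for the Foo in the example above. Since C++17 no copy or move takes place in this initialization; before C++17 compilers were allowed, but not required, to elide it.

```cpp
#include <cassert>

// Hypothetical Foo that counts how often it is copied or moved.
struct Foo {
    static int copies_or_moves;
    Foo() = default;
    Foo(const Foo&) { ++copies_or_moves; }
    Foo(Foo&&) noexcept { ++copies_or_moves; }
};
int Foo::copies_or_moves = 0;

inline int count_copies_for_initialization() {
    Foo::copies_or_moves = 0;
    Foo a = Foo();  // initializes a directly when the copy is elided
    (void)a;
#if __cplusplus >= 201703L
    assert(Foo::copies_or_moves == 0);  // guaranteed from C++17 onwards
#endif
    return Foo::copies_or_moves;
}
```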
lvalue stands for locator value and represents an object that occupies some identifiable location in memory.
The term locator value is also used here:
C
The C programming language followed a similar taxonomy, except that the role of assignment was no longer significant: C expressions are categorized between "lvalue expressions" and others (functions and non-object values), where "lvalue" means an expression that identifies an object, a "locator value"[4].
Everything that is not an lvalue is by exclusion an rvalue. Every expression is either an lvalue or an rvalue.
Originally the term lvalue was used in C to indicate values that can appear on the left side of the assignment operator. However, with the const keyword this changed. Not all lvalues can be assigned to. Those that can are called modifiable lvalues.
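A brief sketch of the difference, using std::is_assignable with T& as a stand-in for an lvalue of type T. The function read_through_address is a hypothetical name.

```cpp
#include <cassert>
#include <type_traits>

static_assert(std::is_assignable<int&, int>::value,
              "a non-const int lvalue is a modifiable lvalue");
static_assert(!std::is_assignable<const int&, int>::value,
              "a const int lvalue is still an lvalue, but not modifiable");

inline int read_through_address() {
    const int limit = 10;
    const int* p = &limit;  // fine: limit is an lvalue, so & applies
    // limit = 11;          // error: limit is not a modifiable lvalue
    assert(*p == 10);
    return *p;
}
```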
And also that literals and temporaries variables are not lvalues, but no reason is given for this statement.
According to this answer literals can be lvalues in some cases.
- Literals of scalar types are rvalues because they are of known size and are very likely to be embedded directly into the machine instructions on the given hardware architecture. What would be the memory location of 5?
- On the contrary, strangely enough, string literals are lvalues since they have unpredictable size and there is no other way to represent them apart from as objects in memory.
An lvalue can be converted to an rvalue. For example, in the following statements
int a = 5;
int b = 3;
int c = a+b;
the operator + takes two rvalues. So a and b are converted to rvalues before getting summed. Another example of conversion:
int c = 6;
&c = 4; //ERROR: &c is an rvalue
On the contrary, you cannot convert an rvalue to an lvalue. However, you can produce a valid lvalue from an rvalue, for example:
int arr[] = {1, 2};
int* p = &arr[0];
*(p + 1) = 10; // OK: p + 1 is an rvalue, but *(p + 1) is an lvalue
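The categories claimed in the examples above can be verified at compile time. This is a sketch: the function name conversions_demo is hypothetical, and the checks use decltype((e)), which gives T& for an lvalue e and plain T for a prvalue e.

```cpp
#include <cassert>
#include <type_traits>

inline int conversions_demo() {
    int a = 5;
    int b = 3;
    int arr[] = {1, 2};
    int* p = &arr[0];

    static_assert(std::is_same<decltype((a)), int&>::value, "a is an lvalue");
    static_assert(std::is_same<decltype((a + b)), int>::value, "a + b is an rvalue");
    static_assert(std::is_same<decltype((&a)), int*>::value, "&a is an rvalue");
    static_assert(std::is_same<decltype((p + 1)), int*>::value, "p + 1 is an rvalue");
    static_assert(std::is_same<decltype((*(p + 1))), int&>::value,
                  "*(p + 1) is an lvalue");

    *(p + 1) = 10;  // writes to arr[1]
    assert(arr[1] == 10);

    return a + b;   // a and b undergo lvalue-to-rvalue conversion here
}
```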
In C++11, rvalue references are related to the move constructor and the move assignment operator.
You can find more details in this clear and well-explained post.
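As a minimal sketch of that relationship, the hypothetical type Tracker below records which constructor built it, so you can see overload resolution pick the move constructor for an rvalue argument and the copy constructor for an lvalue argument. All names here are made up for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that records which constructor was used.
struct Tracker {
    enum class Origin { Default, Copy, Move };
    Origin origin = Origin::Default;

    Tracker() = default;
    Tracker(const Tracker&) : origin(Origin::Copy) {}
    Tracker(Tracker&&) noexcept : origin(Origin::Move) {}
};

inline void overload_demo() {
    Tracker source;
    Tracker from_lvalue(source);             // lvalue argument: copy constructor
    Tracker from_rvalue(std::move(source));  // rvalue argument: move constructor
    assert(from_lvalue.origin == Tracker::Origin::Copy);
    assert(from_rvalue.origin == Tracker::Origin::Move);
}
```

Note that std::move does not move anything by itself; it only produces an rvalue expression that designates the same object, which is what lets the Tracker&& overload be chosen.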