Why doesn't a+++++b work?
Compilers are written in stages. The first stage, the lexer, turns characters into a stream of symbols (tokens), so "++" becomes something like an enum value SYMBOL_PLUS_PLUS. Later, the parser stage turns that stream into an abstract syntax tree, but it cannot change the symbols. You can influence the lexer by inserting spaces, which end a symbol unless they appear inside quotes.
Normal lexers are greedy (with some exceptions), so your code is being interpreted as
a++ ++ +b
The input to the parser is a stream of symbols, so your code would be something like:
[
    SYMBOL_NAME(name = "a"),
    SYMBOL_PLUS_PLUS,
    SYMBOL_PLUS_PLUS,
    SYMBOL_PLUS,
    SYMBOL_NAME(name = "b")
]
The compiler rejects this sequence. (EDIT based on comments: strictly speaking the error is semantic rather than syntactic, because you cannot apply ++ to an rvalue, and a++ yields an rvalue.)
a+++b
is
a++ +b
Which is ok. So are your other examples.
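As a small illustration of how spacing changes the token boundaries, here is a sketch with two helper functions. The names sum_three_plus and sum_with_spaces are made up for this example and are not from the question. Both compile because every ++ ends up applied to a named variable.

```cpp
// Hypothetical helpers for illustration only.

// No spaces: the greedy lexer produces  a  ++  +  b,  i.e. (a++) + b.
int sum_three_plus(int a, int b) {
    return a+++b;
}

// Spaces force the token boundaries:  a  ++  +  ++  b,  i.e. (a++) + (++b).
// This is what a+++++b was presumably meant to say.
int sum_with_spaces(int a, int b) {
    return a++ + ++b;
}
```

With a = 1 and b = 2, the first function adds 1 and 2, and the second adds 1 and 3, since a++ yields the old value of a while ++b yields the incremented value of b.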
printf("%d",a+++++b);
is interpreted as (a++)++ + b according to the Maximal Munch Rule [1]. Postfix ++ doesn't evaluate to an lvalue, but it requires its operand to be an lvalue.
[1] 6.4/4 of the C standard says "the next preprocessing token is the longest sequence of characters that could constitute a preprocessing token".
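The value-category point can be checked at compile time. The following is a sketch in C++ only (in C, neither form of ++ yields an lvalue), and the two helper names are invented for this example. It relies on the fact that decltype of an expression gives T& for an lvalue and plain T for a prvalue.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical helper: true if `x++` is an lvalue for an lvalue x of type T.
template <typename T>
constexpr bool postfix_increment_is_lvalue() {
    return std::is_lvalue_reference<decltype(std::declval<T &>()++)>::value;
}

// Hypothetical helper: true if `++x` is an lvalue for an lvalue x of type T.
template <typename T>
constexpr bool prefix_increment_is_lvalue() {
    return std::is_lvalue_reference<decltype(++std::declval<T &>())>::value;
}

// For int, a++ is not an lvalue, so a second ++ cannot be applied to it.
static_assert(!postfix_increment_is_lvalue<int>(), "a++ is not an lvalue");
// In C++, ++a is an lvalue.
static_assert(prefix_increment_is_lvalue<int>(), "++a is an lvalue in C++");
```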
The lexer uses what's generally called a "maximal munch" algorithm to create tokens. That means that as it reads characters in, it keeps reading until it encounters something that can't be part of the same token as what it already has. For example, if it has been reading digits, so what it has is a number, and it then encounters an A, it knows that can't be part of the number, so it stops and leaves the A in the input buffer to use as the beginning of the next token. It then returns that token to the parser.
In this case, that means a+++++b gets lexed as a ++ ++ + b. Since the first post-increment yields an rvalue, the second can't be applied to it, and the compiler gives an error.
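To make the greedy behaviour concrete, here is a minimal sketch of a maximal-munch tokenizer. It is not how any particular compiler is implemented: the function name tokenize and the tiny operator table are assumptions made for this example, and it only knows identifiers, "++" and "+".

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical operator table for this sketch. Longer spellings come first,
// so the longest possible operator is always tried before a shorter one.
static const char *const kOperators[] = {"++", "+"};

// Splits src into tokens, always taking the longest token it can.
std::vector<std::string> tokenize(const std::string &src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {  // whitespace ends a token and is dropped
            ++i;
            continue;
        }
        if (std::isalpha(c) || c == '_') {  // identifier: munch letters, digits, '_'
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
            continue;
        }
        bool matched = false;
        for (const char *op : kOperators) {  // try "++" before "+"
            const std::string s(op);
            if (src.compare(i, s.size(), s) == 0) {
                tokens.push_back(s);
                i += s.size();
                matched = true;
                break;
            }
        }
        if (!matched) {  // unknown character: emit it as a one-character token
            tokens.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return tokens;
}
```

Calling tokenize("a+++++b") yields the five tokens a, ++, ++, +, b, while tokenize("a++ + ++b") yields a, ++, +, ++, b, because the spaces stop the lexer from gluing the middle + onto its neighbours.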
Just FWIW, in C++ you can overload operator++ to yield an lvalue, which allows this to work. For example:
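    struct bad_code {
        // Postfix ++ (the dummy int parameter marks it as postfix).
        // Returning a reference makes a++ an lvalue, unlike the built-in operator.
        bad_code &operator++(int) {
            return *this;
        }
        // Binary +; the result is arbitrary, it only has to compile.
        int operator+(bad_code const &other) {
            return 1;
        }
    };

    int main() {
        bad_code a, b;
        int c = a+++++b;  // lexed as a ++ ++ + b, i.e. ((a++)++) + b
        return 0;
    }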
This compiles and runs (though it does nothing) with the C++ compilers I have handy (VC++, g++, Comeau).