Why is a statement like 1 + n *= 3 allowed in Ruby?
Checking ruby -y output, you can see exactly what is happening. Given the source of 1 + age *= 2, the output suggests this happens (simplified):
tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.
+ found. Can't deal yet.
tIDENTIFIER found. Knowing that next token is tOP_ASGN (operator-assignment), tIDENTIFIER is recognised as user_variable, and then as var_lhs.
tOP_ASGN found. Can't deal yet.
tINTEGER found. Same as last one, it is ultimately recognised as primary. Knowing that next token is \n, primary is recognised as arg.
At this moment we have arg + var_lhs tOP_ASGN arg on stack. In this context, we recognise the last arg as arg_rhs. We can now pop var_lhs tOP_ASGN arg_rhs from stack and recognise it as arg, with stack ending up as arg + arg, which can be reduced to arg.
arg is then recognised as expr, stmt, top_stmt, top_stmts. \n is recognised as term, then terms, then opt_terms. top_stmts opt_terms are recognised as top_compstmt, and ultimately program.
On the other hand, given the source 1 + age * 2, this happens:
tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.
+ found. Can't deal yet.
tIDENTIFIER found. Knowing that next token is *, tIDENTIFIER is recognised as user_variable, then var_ref, then primary, and arg.
* found. Can't deal yet.
tINTEGER found. Same as last one, it is ultimately recognised as primary. Knowing that next token is \n, primary is recognised as arg.
The stack is now arg + arg * arg. arg * arg can be reduced to arg, and the resultant arg + arg can also be reduced to arg.
arg is then recognised as expr, stmt, top_stmt, top_stmts. \n is recognised as term, then terms, then opt_terms. top_stmts opt_terms are recognised as top_compstmt, and ultimately program.
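To see the effect of the two parses at runtime, here is a minimal sketch. The starting value 5 and the variable names are only illustrative assumptions; the point is that the first expression reassigns age while the second only reads it.

```ruby
# Hypothetical starting value, chosen only for illustration.
age = 5

# Parsed as 1 + (age *= 2): the assignment runs first and updates age.
with_assignment = (1 + age *= 2)
age_after_assignment = age

# Parsed as 1 + (age * 2): age is only read, never reassigned.
without_assignment = 1 + age * 2

puts with_assignment      # 1 + 10
puts age_after_assignment # 10
puts without_assignment   # 1 + 20
```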
What's the critical difference? In the first piece of code, age (a tIDENTIFIER) is recognised as var_lhs (left-hand-side of assignment), but in the second one, it's var_ref (a variable reference). Why? Because the parser that Bison generates for Ruby is an LALR(1) parser, meaning that it has one-token look-ahead. So age is var_lhs because Ruby saw tOP_ASGN coming up; and it was var_ref when it saw *. This comes about because Ruby knows (using the huge state transition table that Bison generates) that one specific production is impossible. Specifically, at this time, the stack is arg + tIDENTIFIER, and the next token is *=. If tIDENTIFIER were recognised as var_ref (which leads up to arg), and arg + arg reduced to arg, then there would be no rule that starts with arg tOP_ASGN; thus, tIDENTIFIER cannot be allowed to become var_ref, and we look at the next matching rule (the var_lhs one).
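If you would rather not read the raw ruby -y trace, the standard library's Ripper can show the same distinction in the resulting tree. This is only a sketch: last_statement is a hypothetical helper, and the node names are Ripper's own (:var_field and :opassign), not the grammar's nonterminals; I am assuming :var_field plays the role of var_lhs here. The variable is assigned first so that Ripper treats age as a known local.

```ruby
require "ripper"

# Hypothetical helper: return Ripper's tree for the last statement of a snippet.
def last_statement(source)
  Ripper.sexp(source)[1].last
end

assign_tree = last_statement("age = 0\n1 + age *= 2")
read_tree   = last_statement("age = 0\n1 + age * 2")

# A :binary node is [:binary, left, operator, right], so index 3 is the
# right operand of the outer +.
p assign_tree[3][0]    # node type of the right operand in the *= case
p assign_tree[3][1][0] # how age was classified in the *= case
p read_tree[3][0]      # node type of the right operand in the * case
p read_tree[3][1][0]   # how age was classified in the * case
```

In the first tree the right operand of + is the whole operator-assignment with age as its target; in the second it is an ordinary multiplication with age as a variable reference.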
So Aleksei is partly right in that there is some truth to "when it sees a syntax error, it tries another way", but it is limited to one token into the future, and the "attempt" is just a lookup in the state table. Ruby is incapable of the complex repair strategies we humans use to understand sentences like "the horse raced past the barn fell", where we happily parse till the last word, then reevaluate the whole sentence when the first parse turns out to be impossible.
tl;dr: The precedence table is not exactly correct. There is no place in Ruby source where it exists; rather, it is the result of the interplay of various parsing rules. Many of the precedence rules break down when the left-hand side of an assignment is introduced.
The simplified answer is: you can only assign a value to a variable, not to an expression. Therefore the order is 1 + (age *= 2). Precedence only comes into play when multiple groupings are possible. For example, age *= 2 + 1 can be seen as (age *= 2) + 1 or age *= (2 + 1). Since both are possible and + has a higher precedence than *=, age *= (2 + 1) is used.
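A quick way to check the two groupings is to compare the default parse against explicit parentheses. The starting value 5 and the variable names are just assumptions for the example.

```ruby
# Default parse: + binds tighter than *=, so this is age *= (2 + 1).
age = 5
age *= 2 + 1
grouped_right = age          # 5 * 3

# Explicit parentheses force the other grouping: (age *= 2) + 1.
age = 5
grouped_left = (age *= 2) + 1
age_after_left = age         # age itself only got doubled

puts grouped_right
puts grouped_left
puts age_after_left
```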
NB This answer should not be marked as solving the issue. See the answer by @Amadan for the correct explanation.
I am not sure which “many Ruby documentations” you mean; here is the official one.
The Ruby parser does its best to understand and successfully parse the input; when it sees a syntax error, it tries another way. In that sense, avoiding a syntax error takes priority over all operator precedence rules.
Since the left-hand operand must be a variable, it starts with an assignment. Here is a case where the parsing can be done with the default precedence order and + is done before *=:
age = 2
age *= age + 1
#⇒ 6