Why is a statement like 1 + n *= 3 allowed in Ruby?

Checking ruby -y output, you can see exactly what is happening. Given the source of 1 + age *= 2, the output suggests this happens (simplified):

tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.

+ found. Can't deal yet.

tIDENTIFIER found. Knowing that next token is tOP_ASGN (operator-assignment), tIDENTIFIER is recognised as user_variable, and then as var_lhs.

tOP_ASGN found. Can't deal yet.

tINTEGER found. Same as last one, it is ultimately recognised as primary. Knowing that next token is \n, primary is recognised as arg.

At this moment we have arg + var_lhs tOP_ASGN arg on stack. In this context, we recognise the last arg as arg_rhs. We can now pop var_lhs tOP_ASGN arg_rhs from stack and recognise it as arg, with stack ending up as arg + arg, which can be reduced to arg.

arg is then recognised as expr, stmt, top_stmt, top_stmts. \n is recognised as term, then terms, then opt_terms. top_stmts opt_terms are recognised as top_compstmt, and ultimately program.

On the other hand, given the source 1 + age * 2, this happens:

tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.

+ found. Can't deal yet.

tIDENTIFIER found. Knowing that next token is *, tIDENTIFIER is recognised as user_variable, then var_ref, then primary, and arg.

* found. Can't deal yet.

tINTEGER found. Same as last one, it is ultimately recognised as primary. Knowing that next token is \n, primary is recognised as arg.

The stack is now arg + arg * arg. arg * arg can be reduced to arg, and the resultant arg + arg can also be reduced to arg.

What's the critical difference? In the first piece of code, age (a tIDENTIFIER) is recognised as var_lhs (left-hand-side of assignment), but in the second one, it's var_ref (a variable reference). Why? Because Bison is a LALR(1) parser, meaning that it has one-token look-ahead. So age is var_lhs because Ruby saw tOP_ASGN coming up; and it was var_ref when it saw *. This comes about because Ruby knows (using the huge state transition table that Bison generates) that one specific production is impossible. Specifically, at this time, the stack is arg + tIDENTIFIER, and next token is *=. If tIDENTIFIER is recognised as var_ref (which leads up to arg), and arg + arg reduced to arg, then there is no rule that starts with arg tOP_ASGN; thus, tIDENTIFIER cannot be allowed to become var_ref, and we look at the next matching rule (the var_lhs one).

So Aleksei is partly right in that there is some truth to "when it sees a syntax error, it tries another way", but it is limited to one token into future, and the "attempt" is just a lookup in the state table. Ruby is incapable of complex repair strategies we humans use to understand sentences like "the horse raced past the barn fell", where we happily parse till the last word, then reevaluate the whole sentence when the first parse turns out impossible.

tl;dr: The precedence table is not exactly correct. There is no place in Ruby source where it exists; rather, it is the result of the interplay of various parsing rules. Many of the precedence rules break in when left-hand-side of an assignment is introduced.

The simplified answer is. You can only assign a value to a variable, not to an expression. Therefore the order is 1 + (age *= 2). The precedence only comes into play if multiple options are possible. For example age *= 2 + 1 can be seen as (age *= 2) + 1 or age *= (2 + 1), since multiple options are possible and the + has a higher precedence than *=, age *= (2 + 1) is used.

NB This answer should not be marked as solving the issue. See the answer by @Amadan for the correct explanation.

I am not sure what “many Ruby documentations” you mentioned, here is the official one.

Ruby parser does its best to understand and successfully parse the input; when it sees a syntax error, it tries another way. That said, syntax errors have greater precedence compared to all operator precedence rules.

Since LHO must be variable, it starts with an assignment. Here is the case when the parsing can be done with a default precedence order and + is done before *=:

age = 2
age *= age + 1
#⇒ 6

Why is a statement like 1 + n *= 3 allowed in Ruby?

Tags:

Ruby

Related

Recent Posts