What are the rules for evaluation order in Java?
Eric Lippert's masterful answer is nonetheless not properly helpful because it is talking about a different language. This is Java, where the Java Language Specification is the definitive description of the semantics. In particular, §15.26.1 is relevant because that describes the evaluation order for the =
operator (we all know that it is right-associative, yes?). Cutting it down a little to the bits that we care about in this question:
If the left-hand operand expression is an array access expression (§15.13), then many steps are required:
- First, the array reference subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason; the index subexpression (of the left-hand operand array access expression) and the right-hand operand are not evaluated and no assignment occurs.
- Otherwise, the index subexpression of the left-hand operand array access expression is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and the right-hand operand is not evaluated and no assignment occurs.
- Otherwise, the right-hand operand is evaluated. If this evaluation completes abruptly, then the assignment expression completes abruptly for the same reason and no assignment occurs.
[… it then goes on to describe the actual meaning of the assignment itself, which we can ignore here for brevity …]
In short, Java has a very closely defined evaluation order that is pretty much exactly left-to-right within the arguments to any operator or method call. Array assignments are one of the more complex cases, but even there it's still L2R. (The JLS does recommend that you don't write code that needs these sorts of complex semantic constraints, and so do I: you can get into more than enough trouble with just one assignment per statement!)
C and C++ are definitely different to Java in this area: their language definitions leave evaluation order undefined deliberately to enable more optimizations. C# is like Java apparently, but I don't know its literature well enough to be able to point to the formal definition. (This really varies by language though, Ruby is strictly L2R, as is Tcl — though that lacks an assignment operator per se for reasons not relevant here — and Python is L2R but R2L in respect of assignment, which I find odd but there you go.)
Let me say this very clearly, because people misunderstand this all the time:
Order of evaluation of subexpressions is independent of both associativity and precedence. Associativity and precedence determine in what order the operators are executed but do not determine in what order the subexpressions are evaluated. Your question is about the order in which subexpressions are evaluated.
Consider A() + B() + C() * D()
. Multiplication is higher precedence than addition, and addition is left-associative, so this is equivalent to (A() + B()) + (C() * D())
But knowing that only tells you that the first addition will happen before the second addition, and that the multiplication will happen before the second addition. It does not tell you in what order A(), B(), C() and D() will be called! (It also does not tell you whether the multiplication happens before or after the first addition.) It would be perfectly possible to obey the rules of precedence and associativity by compiling this as:
d = D() // these four computations can happen in any order
b = B()
c = C()
a = A()
sum = a + b // these two computations can happen in any order
product = c * d
result = sum + product // this has to happen last
All the rules of precedence and associativity are followed there -- the first addition happens before the second addition, and the multiplication happens before the second addition. Clearly we can do the calls to A(), B(), C() and D() in any order and still obey the rules of precedence and associativity!
We need a rule unrelated to the rules of precedence and associativity to explain the order in which the subexpressions are evaluated. The relevant rule in Java (and C#) is "subexpressions are evaluated left to right". Since A() appears to the left of C(), A() is evaluated first, regardless of the fact that C() is involved in a multiplication and A() is involved only in an addition.
So now you have enough information to answer your question. In a[b] = b = 0
the rules of associativity say that this is a[b] = (b = 0);
but that does not mean that the b=0
runs first! The rules of precedence say that indexing is higher precedence than assignment, but that does not mean that the indexer runs before the rightmost assignment.
(UPDATE: An earlier version of this answer had some small and practically unimportant omissions in the section which follows which I have corrected. I've also written a blog article describing why these rules are sensible in Java and C# here: https://ericlippert.com/2019/01/18/indexer-error-cases/)
Precedence and associativity only tell us that the assignment of zero to b
must happen before the assignment to a[b]
, because the assignment of zero computes the value that is assigned in the indexing operation. Precedence and associativity alone say nothing about whether the a[b]
is evaluated before or after the b=0
.
Again, this is just the same as: A()[B()] = C()
-- All we know is that the indexing has to happen before the assignment. We don't know whether A(), B(), or C() runs first based on precedence and associativity. We need another rule to tell us that.
The rule is, again, "when you have a choice about what to do first, always go left to right". However, there is an interesting wrinkle in this specific scenario. Is the side effect of a thrown exception caused by a null collection or out-of-range index considered part of the computation of the left side of the assignment, or part of the computation of the assignment itself? Java chooses the latter. (Of course, this is a distinction that only matters if the code is already wrong, because correct code does not dereference null or pass a bad index in the first place.)
So what happens?
- The
a[b]
is to the left of theb=0
, so thea[b]
runs first, resulting ina[1]
. However, checking the validity of this indexing operation is delayed. - Then the
b=0
happens. - Then the verification that
a
is valid anda[1]
is in range happens - The assignment of the value to
a[1]
happens last.
So, though in this specific case there are some subtleties to consider for those rare error cases that should not be occurring in correct code in the first place, in general you can reason: things to the left happen before things to the right. That's the rule you're looking for. Talk of precedence and associativity is both confusing and irrelevant.
People get this stuff wrong all the time, even people who should know better. I have edited far too many programming books that stated the rules incorrectly, so it is no surprise that lots of people have completely incorrect beliefs about the relationship between precedence/associativity, and evaluation order -- namely, that in reality there is no such relationship; they are independent.
If this topic interests you, see my articles on the subject for further reading:
http://blogs.msdn.com/b/ericlippert/archive/tags/precedence/
They are about C#, but most of this stuff applies equally well to Java.