What are the differences between a+i and &a[i] for pointer arithmetic in C++?
TL;DR: a+i and &a[i] are both well-defined and produce a null pointer when a is a null pointer and i is 0, according to (the intent of) the standard, and all compilers agree.
a+i is obviously well-defined per [expr.add]/4 of the latest draft standard:
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- [...]
&a[i] is tricky. Per [expr.sub]/1, a[i] is equivalent to *(a+i), thus &a[i] is equivalent to &*(a+i). Now the standard is not quite clear about whether &*(a+i) is well-defined when a+i is a null pointer. But as @n.m. points out in a comment, the intent as recorded in CWG 232 is to permit this case.
Since core language UB is required to be caught in a constant expression ([expr.const]/(4.6)), we can test whether compilers think these two expressions are UB.
Here's the demo: if the compilers think the constant expression in static_assert is UB, or if they think the result is not true, then they must produce a diagnostic (error or warning) per the standard:
(note that this uses single-parameter static_assert and constexpr lambdas, which are C++17 features, and default arguments for lambda parameters, which have been allowed since C++14)
static_assert(nullptr == [](char* a=nullptr, int i=0) {
return a+i;
}());
static_assert(nullptr == [](char* a=nullptr, int i=0) {
return &a[i];
}());
From https://godbolt.org/z/hhsV4I, it seems all compilers behave uniformly in this case, producing no diagnostics at all (which surprises me a bit).
However, this is different from the offset case. The implementation posted in that question explicitly creates a reference (which is necessary to sidestep user-defined operator&), and thus is subject to the requirements on references.
In the C++ standard, section [expr.sub]/1, you can read:
The expression E1[E2] is identical (by definition) to *((E1)+(E2)).
This means that &a[i] is exactly the same as &*(a+i). So you would dereference (*) the pointer first and take the address (&) second. If the pointer is invalid (e.g. nullptr, but also out of range), this is UB.
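For pointers that do point into an array, the equivalence from [expr.sub]/1 can be checked at compile time. This is only a rough sketch: same_address and values are names made up for this illustration, and the loop inside a constexpr function assumes C++14 or later. It deliberately stays within the array bounds, so it says nothing about the null or out-of-range cases.

```cpp
#include <cstddef>

// Hypothetical helper: returns true if a+i and &a[i] compare equal
// for every in-range index i of the array a.
template <std::size_t N>
constexpr bool same_address(const int (&a)[N]) {
    for (std::size_t i = 0; i < N; ++i) {
        if (a + i != &a[i]) {
            return false;
        }
    }
    return true;
}

// Illustrative data, not taken from the question.
constexpr int values[] = {10, 20, 30};

// Evaluated at compile time; any UB during evaluation would make this ill-formed.
static_assert(same_address(values), "a+i and &a[i] agree for in-range i");
```

Since the check runs inside a static_assert, a compiler that considered either expression UB for these indices would have to reject the program.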
a+i is based on pointer arithmetic. At first it looks less dangerous, since there is no dereferencing that would be UB for sure. However, it may also be UB (see [expr.add]/4):
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 ≤ i + j ≤ n; otherwise, the behavior is undefined. Likewise, the expression P - J points to the (possibly-hypothetical) element x[i − j] if 0 ≤ i − j ≤ n; otherwise, the behavior is undefined.
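The range 0 ≤ i + j ≤ n in that wording can be probed with constant expressions, because UB is not allowed during constant evaluation. The following is a minimal sketch, with data and distance_after_add being hypothetical names chosen for this example:

```cpp
#include <cstddef>

// Illustrative array with n == 3 elements.
constexpr int data[3] = {1, 2, 3};

// Hypothetical helper: forms data + j and measures how far it is from data.
constexpr std::ptrdiff_t distance_after_add(std::ptrdiff_t j) {
    return (data + j) - data;
}

static_assert(distance_after_add(0) == 0, "pointer to the first element");
static_assert(distance_after_add(2) == 2, "pointer to the last element");
static_assert(distance_after_add(3) == 3, "one past the end (j == n) is allowed");

// Uncommenting the next line should make the program ill-formed, because
// data + 4 falls outside [0, n] and is therefore not a constant expression:
// static_assert(distance_after_add(4) == 4, "");
```

Note that only forming the one-past-the-end pointer is permitted here; reading through it would be a separate problem.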
So, while the semantics behind these two expressions are slightly different, I would say that the result is the same in the end.