Access an array from the end in C?

The Standard specifies that arrayLvalue[i] means (*((arrayLvalue)+(i))), which would be processed by taking the address of the first element of arrayLvalue. Even so, gcc sometimes treats [], when applied to an array-type value or lvalue, as an operator that behaves like an indexed version of .member syntax, yielding a value or lvalue that the compiler treats as part of the array type. I don't know whether this is ever observable when the array-type operand isn't a member of a struct or union, but the effects are clearly demonstrable when it is, and I know of nothing that would guarantee similar logic isn't applied to nested arrays.

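struct foo {unsigned char x[12];};

/* Indexes the member array directly with []. */
int test1(struct foo *p1, struct foo *p2)
{
    p1->x[0] = 1;
    p2->x[1] = 2;
    return p1->x[0];
}

/* Same stores, but the second goes through an explicit element pointer. */
int test2(struct foo *p1, struct foo *p2)
{
    p1->x[0] = 1;
    (&p2->x[0])[1] = 2;
    return p1->x[0];
}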

The code gcc generates for test1 will always return 1, while the generated code for test2 will reload p1->x[0] after the second store and return whatever it holds. I am unaware of anything in the Standard or in the gcc documentation that would suggest the two functions should behave differently, and I don't know how one should force a compiler to generate code that accommodates the case where p1 and p2 identify overlapping parts of an allocated block, should that be necessary. Although the optimization used in test1() would be reasonable for the function as written, I know of no documented interpretation of the Standard that would treat that case as UB but define the behavior of the code if it wrote to p2->x[0] instead of p2->x[1].

The C standard does not define the behavior of (&array)[1].

Consider &array + 1. This is defined by the C standard, for two reasons:

When doing pointer arithmetic, the result is defined for results from the first element (with index 0) of an array to one beyond the last element.
When doing pointer arithmetic, a pointer to a single object behaves like a pointer to an array with one element. In this case, &array is a pointer to a single object (that is itself an array, but the pointer arithmetic is for the pointer-to-the-array, not a pointer-to-an-element).

So &array + 1 is defined pointer arithmetic that points just beyond the end of array.

However, by the definition of the subscript operator, (&array)[1] is *(&array + 1). While &array + 1 is defined, applying * to it is not. C 2018 6.5.6 paragraph 8 explicitly tells us, about the result of pointer arithmetic, “If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.”

Because of the way most compilers are designed, the code in the question may move data around as you desire. However, this is not behavior you should rely on. You can obtain a valid pointer to just beyond the last element of the array with char *End = array + sizeof array / sizeof *array;. Here array decays to a pointer to its first element, so the arithmetic is done in elements. Then you can use End[-1] to refer to the last element, End[-2] to refer to the penultimate element, and so on.
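As a minimal sketch of that End-pointer approach: the helper names from_end and demo_from_end, and the four-element char array, are hypothetical and only there for illustration. The code assumes a char element type, as in the End declaration above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the element `back` positions before `end`,
   where `end` points one past the last element of an array of `count`
   chars. back == 1 is the last element, back == 2 the penultimate one. */
char from_end(const char *end, size_t count, size_t back)
{
    assert(back >= 1 && back <= count);  /* never step outside the array */
    return end[-(ptrdiff_t)back];
}

/* Hypothetical demo on a small local array. */
char demo_from_end(size_t back)
{
    char array[] = {'a', 'b', 'c', 'd'};
    size_t count = sizeof array / sizeof *array;

    /* array decays to a pointer to its first element, so this is
       element-wise arithmetic and yields the one-past-the-end pointer.
       End itself is never dereferenced; only End[-1], End[-2], ... are. */
    const char *End = array + count;

    return from_end(End, count, back);
}
```

Calling demo_from_end(1) reads End[-1], the last element, and demo_from_end(2) reads End[-2]. The one-past-the-end pointer is only ever used with a negative index, so the unary * restriction quoted above is not violated.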

Tags: C, Arrays, Pointers, Language Lawyer