Is the strict aliasing rule incorrectly specified?

Starting with your example:

int strict_aliasing_example(int *i, float *f)
{
    *i = 1;
    *f = 1.0;
    return (*i);
}

Let's first acknowledge that, in the absence of any unions, this would violate the strict aliasing rule if i and f both point to the same object. Assuming the object has no declared type, *i = 1 sets the effective type to int and *f = 1.0 then sets it to float; the final return (*i) then accesses an object with effective type float via an lvalue of type int, which is clearly not allowed.
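The effective-type sequence described above can be sketched with allocated storage, which has no declared type. This is only an illustration: the function name effective_type_demo is hypothetical, and the offending read is left commented out so that the sketch itself stays well defined.

```c
#include <assert.h>   /* only needed for the check in the usage note */
#include <stdlib.h>

/* Hypothetical helper: walks through the stores and reads discussed above. */
int effective_type_demo(void)
{
    size_t size = sizeof(int) > sizeof(float) ? sizeof(int) : sizeof(float);
    void *p = malloc(size);   /* allocated storage has no declared type */
    if (p == NULL)
        return -1;

    int *ip = p;
    float *fp = p;

    *ip = 1;          /* effective type of the object becomes int */
    int a = *ip;      /* fine: int object read via an int lvalue */

    *fp = 1.0f;       /* effective type of the object becomes float */
    float b = *fp;    /* fine: float object read via a float lvalue */

    /* int c = *ip;      not allowed: float object read via an int lvalue */

    free(p);
    return a == 1 && b == 1.0f;
}
```

With the last read left out, a check such as assert(effective_type_demo() == 1) should hold; uncommenting it reproduces the violation described above.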

The question is about whether this would still amount to a strict-aliasing violation if both i and f point to members of the same union. For this not to be the case, it would either have to be that there is some special exemption from the strict aliasing rule that applies in this situation, or that accessing the object via *i does not (also) access the same object as *f.
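To make the scenario concrete, here is a minimal sketch of such a call. The union type int_or_float and the helpers call_with_union_members and members_overlap are hypothetical names introduced only for illustration; whether the call has defined behaviour is exactly what is in question, so no result is claimed for it.

```c
#include <assert.h>   /* only needed for the check in the usage note */

/* Hypothetical union type, used only for illustration. */
union int_or_float {
    int   i;
    float f;
};

/* The function from the question. */
int strict_aliasing_example(int *i, float *f)
{
    *i = 1;
    *f = 1.0;
    return (*i);
}

/* Hypothetical caller: both arguments point to members of the same
 * union object, so the two pointers refer to overlapping storage. */
int call_with_union_members(void)
{
    union int_or_float u;
    return strict_aliasing_example(&u.i, &u.f);
}

/* Returns nonzero if both members start at the same address. */
int members_overlap(const union int_or_float *u)
{
    return (const void *)&u->i == (const void *)&u->f;
}
```

For any union int_or_float u, assert(members_overlap(&u)) should hold, since a pointer to a union, suitably converted, points to each of its members. Inside strict_aliasing_example, however, nothing indicates that the two pointers came from a union at all.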

On union member access via the "." member access operator, the standard says (6.5.2.3):

A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. The value is that of the named member (95) and is an lvalue if the first expression is an lvalue.

Footnote 95, referred to above, says:

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called ‘‘type punning’’). This might be a trap representation.

This is clearly intended to allow type punning via a union, but it should be noted that (1) footnotes are non-normative, that is, they are not supposed to prescribe behaviour, but rather to clarify the intention of some part of the text in accordance with the rest of the specification, and (2) compiler vendors deem this allowance for type punning via a union to apply only to access via the union member access operator - since otherwise strict aliasing is pretty useless for optimisation, as just about any two pointers could potentially refer to different members of the same union (your example is a case in point).
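For contrast, this is the form of type punning that footnote 95 describes and that compilers are generally understood to accept: both the store and the read go through the member access operator. It is a sketch under stated assumptions: the names float_bits and float_to_bits are hypothetical, uint32_t is used instead of int so that the read cannot hit a trap representation, and the bit pattern mentioned afterwards assumes IEEE 754 single precision.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption of this sketch: float occupies exactly 32 bits. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "sketch assumes a 32-bit float");

/* Hypothetical union type, used only for illustration. */
union float_bits {
    float    f;
    uint32_t u;
};

/* Reinterprets the object representation of x as an unsigned integer. */
uint32_t float_to_bits(float x)
{
    union float_bits pun;
    pun.f = x;       /* store via the member access operator */
    return pun.u;    /* read a different member via the member access
                        operator: the reinterpretation of footnote 95 */
}
```

On an implementation using IEEE 754 single precision, float_to_bits(1.0f) is expected to yield 0x3F800000.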

So at this point, we can say that:

  • the code in your example is explicitly allowed by a non-normative footnote
  • the normative text on the other hand seems to disallow your example (due to strict aliasing), assuming that accessing one member of a union also constitutes access to another - but more on this shortly

Does accessing one member of a union actually access the others, though? If not, the strict aliasing rule isn't concerned with the example. (If it does, the strict aliasing rule, problematically, disallows just about any type-punning via a union).

A union is defined as (6.2.5 para 20):

A union type describes an overlapping nonempty set of member objects

And note that (6.7.2.1 para 16):

The value of at most one of the members can be stored in a union object at any time

Since access is defined as (3.1):

〈execution-time action〉 to read or modify the value of an object

... and since non-active union members do not have a stored value, presumably accessing one member does not constitute access to the other members!

However, the definition of member access (6.5.2.3, quoted above) says "The value is that of the named member" (this is the precise statement that footnote 95 is attached to) - if the member has no value, what then? Footnote 95 gives an answer but as I've noted it is not supported by the normative text.

In any case, nothing in the text would seem to imply that reading or modifying a union member "via the member object" (i.e. directly via an expression using the member access operator) should be any different from reading or modifying it via a pointer to that same member. The consensus understanding applied by compiler vendors, which allows them to perform optimisations under the assumption that pointers of different types do not alias, and which requires that type punning be performed only via expressions involving member access, is not supported by the text of the standard.

If footnote 95 is considered normative, your example is perfectly fine code without undefined behaviour (unless the value of (*i) is a trap representation), according to the rest of the text. However, if footnote 95 is not considered normative, there is an attempted access to an object which has no stored value and the behaviour then is at best unclear (though the strict aliasing rule is arguably not relevant).

In the current understanding of compiler vendors, your example has undefined behaviour, but since this isn't specified in the standard, it's not clear exactly what constraint the code violates.

Personally, I think the "fix" to the standard is to:

  • disallow access to a non-active union member except via lvalue conversion of a member access expression, or via assignment where the left-hand-side is a member access expression (an exception to this could perhaps be made for when the member in question has character type, since that would not have an effect on possible optimisations due to a similar exception in the strict aliasing rule itself)
  • specify in the normative text that the value of a non-active member is as is currently described by footnote 95

That would make your example not a violation of the strict aliasing rule, but rather a violation of the constraint that a non-active union member must be accessed only via an expression containing the member access operator (and appropriate member).
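As a rough sketch of what the proposed rule would still permit, the example can be rewritten so that every access names the member through the union object. The names int_or_float and union_access_example are hypothetical, and the sketch assumes that int and float have the same size and that int has no trap representations.

```c
#include <assert.h>

/* Assumption of this sketch: int and float have the same size. */
static_assert(sizeof(int) == sizeof(float),
              "sketch assumes int and float have the same size");

/* Hypothetical union type, used only for illustration. */
union int_or_float {
    int   i;
    float f;
};

/* Variant of the example in which every access is a member access
 * expression on the union object rather than a plain pointer dereference. */
int union_access_example(union int_or_float *u)
{
    u->i = 1;        /* i is now the active member */
    u->f = 1.0f;     /* active member changed via member access */
    return u->i;     /* non-active member read via member access:
                        its value is as described by footnote 95 */
}
```

With a 32-bit int and IEEE 754 single precision, this is expected to return 0x3F800000 (the representation of 1.0f) rather than 1, because the compiler can see that both accesses involve the same union object.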

Therefore, to answer your question - Is the strict aliasing rule incorrectly specified? - no, the strict aliasing rule is not relevant to this example because the objects accessed by the two pointer dereferences are separate objects and, even though they overlap in storage, only one of them has a value at a time. However, the union member access rules are incorrectly specified.

A note on Defect Report 236:

Arguments about union semantics invariably refer to DR 236 at some point. Indeed, your example code is superficially very similar to the code in that Defect Report. I would note that:

  1. The example in DR 236 is not about type-punning. It is about whether it is ok to assign to a non-active union member via a pointer to that member. The code in question is subtly different to that in the question here, since it does not attempt to access the "original" union member again after writing to the second member. Thus, despite the structural similarity in the example code, the Defect Report is largely unrelated to your question.
  2. "Committee believes that Example 2 violates the aliasing rules in 6.5 paragraph 7" - this indicates that the committee believes that writing a "non-active" union member, but not via an expression containing a member access of the union object, is a strict-aliasing violation. As I've detailed above, this is not supported by the text of the standard.
  3. "In order to not violate the rules, function f in example should be written as" - i.e. you must use the union object (and the "." operator) to change the active member type; this is in agreement with the "fix" to the standard I proposed above.
  4. The Committee Response in DR 236 claims that "Both programs invoke undefined behavior". It has no explanation for why the first does so, and its explanation for why the second does so seems to be wrong.

Under the definition of union members in §6.5.2.3:

3 A postfix expression followed by the . operator and an identifier designates a member of a structure or union object. ...

4 A postfix expression followed by the -> operator and an identifier designates a member of a structure or union object. ...

See also §6.2.3 ¶1:

  • the members of structures or unions; each structure or union has a separate name space for its members (disambiguated by the type of the expression used to access the member via the . or -> operator);

It is clear that footnote 95 refers to the access of a union member with the union in scope and using the . or -> operator.

Since assignments and accesses to the bytes comprising the union are not made through union members but through pointers, your program does not invoke the aliasing rules of union members (including those clarified by footnote 95).

Further, normal aliasing rules are violated since the effective type of the object after *f = 1.0 is float, but its stored value is accessed by an lvalue of type int (see §6.5 ¶7).

Note: All references cite the C11 standard draft.