Using this pointer causes strange deoptimization in hot loop
Pointer aliasing seems to be the problem, ironically between this and this->target. The compiler is taking into account the rather obscene possibility that you initialized:
this->target = reinterpret_cast<uint8_t*>(this);
In that case, writing to this->target[0] would alter the contents of this (and thus, this->target).
The memory aliasing problem is not restricted to the above. In principle, any use of this->target[XX] given an (in)appropriate value of XX might point to this.
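To make the scenario concrete, here is a minimal sketch of the case the compiler has to allow for. Holder and the function name are hypothetical stand-ins, not the class from the question; the only assumption carried over is that target is a uint8_t pointer member.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the class in the question.
struct Holder {
    std::uint8_t* target;
};

// Returns true if a store through target changed the bytes of the object
// that holds target.
bool store_through_target_changes_holder() {
    Holder h;
    // The "obscene" initialization: target points at the holder itself.
    h.target = reinterpret_cast<std::uint8_t*>(&h);

    // Snapshot the raw bytes of the holder before the store.
    unsigned char before[sizeof(Holder)];
    std::memcpy(before, &h, sizeof(Holder));

    // Flip every bit of the first byte; that byte is part of h.target.
    h.target[0] = static_cast<std::uint8_t>(~h.target[0]);

    // Compare raw bytes only; h.target is not dereferenced again.
    return std::memcmp(before, &h, sizeof(Holder)) != 0;
}
```

Because this is legal code, the compiler cannot keep this->target in a register across such a store unless it can prove the store goes elsewhere.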
I am better versed in C, where this can be remedied by declaring pointer variables with the restrict keyword (spelled __restrict__ as a GCC extension).
The issue here is strict aliasing, which says that we are allowed to alias any object through a char*, and that is what prevents the compiler from optimizing in your case. We are not allowed to alias through a pointer of a different, incompatible type; that would be undefined behavior. Normally on SO the problem we see is the opposite one: users attempting to alias through incompatible pointer types.
It would seem reasonable to implement uint8_t as an unsigned char, and if we look at cstdint on Coliru, it includes stdint.h, which typedefs uint8_t as follows:
typedef unsigned char uint8_t;
If you used another non-char type, then the compiler should be able to optimize.
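Whether uint8_t really is a character type on a given toolchain can be checked at compile time. The following is a small sketch; the two constant names are made up for illustration, and the result depends on the implementation's stdint.h.

```cpp
#include <cstdint>
#include <type_traits>

// True when uint8_t is a typedef for unsigned char on this implementation.
constexpr bool uint8_is_unsigned_char =
    std::is_same<std::uint8_t, unsigned char>::value;

// True when uint8_t is one of the types covered by the char aliasing
// exception quoted in this answer, so stores through uint8_t* may alias anything.
constexpr bool uint8_may_alias_anything =
    uint8_is_unsigned_char || std::is_same<std::uint8_t, char>::value;
```

If uint8_may_alias_anything is true, switching the buffer to a wider element type such as uint16_t is one way to let the compiler assume that stores into it do not touch the pointer member.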
This is covered in the draft C++ standard section 3.10 Lvalues and rvalues, which says:
If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined
and includes the following bullet:
- a char or unsigned char type.
Note, I posted a comment on possible workarounds in a question that asks When is uint8_t ≠ unsigned char? and the recommendation was:
The trivial workaround, however, is to use the restrict keyword, or to copy the pointer to a local variable whose address is never taken so that the compiler does not need to worry about whether the uint8_t objects can alias it.
Since C++ does not support the restrict keyword, you have to rely on a compiler extension; for example, gcc uses __restrict__. This is not totally portable, but the other suggestion should be.
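As a rough sketch of the extension route: the RESTRICT macro and the unpack_all function below are hypothetical names, and the MSVC spelling __restrict is included on the assumption that you may need to build there too. On compilers that match neither branch the macro expands to nothing, so the code still compiles but without the aliasing hint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical portability macro wrapping the non-standard keyword.
#if defined(__GNUC__) || defined(__clang__)
#  define RESTRICT __restrict__
#elif defined(_MSC_VER)
#  define RESTRICT __restrict
#else
#  define RESTRICT
#endif

// Unpacks three 3-bit fields from each input word into three output bytes.
// RESTRICT promises that out and in do not overlap, so the compiler does not
// have to assume that a byte store into out changed in[i].
void unpack_all(std::uint8_t* RESTRICT out,
                const std::uint32_t* RESTRICT in,
                std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[3 * i + 0] = static_cast<std::uint8_t>(in[i] & 0x7);
        out[3 * i + 1] = static_cast<std::uint8_t>((in[i] >> 3) & 0x7);
        out[3 * i + 2] = static_cast<std::uint8_t>((in[i] >> 6) & 0x7);
    }
}
```

The promise is the caller's responsibility: passing overlapping buffers to a function declared this way is undefined behavior.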
Strict aliasing rules allow char* to alias any other pointer. So this->target may alias with this, and in your method, the first part of the code,
target[0] = t & 0x7;
target[1] = (t >> 3) & 0x7;
target[2] = (t >> 6) & 0x7;
is in fact
this->target[0] = t & 0x7;
this->target[1] = (t >> 3) & 0x7;
this->target[2] = (t >> 6) & 0x7;
and the compiler has to reload this->target before each store, as this may be modified when you modify the contents of this->target.
Once this->target is cached into a local variable, the alias is no longer possible with the local variable.
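A minimal sketch of that fix, assuming target is a uint8_t* member and t is an unsigned 32-bit value; Unpacker, unpack_member, and unpack_local are hypothetical names, since the original class is not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the class in the question.
struct Unpacker {
    std::uint8_t* target;

    // Every store goes through this->target, so the compiler must assume the
    // store may have changed this->target and reload it each time.
    void unpack_member(std::uint32_t t) {
        target[0] = static_cast<std::uint8_t>(t & 0x7);
        target[1] = static_cast<std::uint8_t>((t >> 3) & 0x7);
        target[2] = static_cast<std::uint8_t>((t >> 6) & 0x7);
    }

    // The pointer is copied once into a local whose address is never taken,
    // so no store through it can modify the local itself.
    void unpack_local(std::uint32_t t) {
        std::uint8_t* const out = target;
        out[0] = static_cast<std::uint8_t>(t & 0x7);
        out[1] = static_cast<std::uint8_t>((t >> 3) & 0x7);
        out[2] = static_cast<std::uint8_t>((t >> 6) & 0x7);
    }
};
```

Both methods write the same bytes; the difference is only in how many times the pointer has to be loaded, which is what matters inside a hot loop.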