Why does it make a difference if left and right shift are used together in one expression or not?
This little test is actually more subtle than it looks, as the behavior is implementation-defined:
unsigned char x = 255;
No ambiguity here: x is an unsigned char with value 255, and type unsigned char is guaranteed to have enough range to store 255.

printf("%x\n", x);
This produces ff on standard output, but it would be cleaner to write printf("%hhx\n", x); as printf expects an unsigned int for the %x conversion, which x is not. Passing x might actually pass an int or an unsigned int argument.

unsigned char tmp = x << 7;
To evaluate the expression x << 7, x, being an unsigned char, first undergoes the integer promotions defined in the C Standard, section 6.3.1.1:

"If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions."

So if the number of value bits in unsigned char is less than or equal to that of int (the most common case currently being 8 vs 31), x is first promoted to an int with the same value, which is then shifted left by 7 positions. The result, 0x7f80, is guaranteed to fit in the int type, so the behavior is well defined, and converting this value to type unsigned char effectively truncates the high-order bits of the value. If type unsigned char has 8 bits, the value will be 128 (0x80), but if type unsigned char has more bits, the value in tmp can be 0x180, 0x380, 0x780, 0xf80, 0x1f80, 0x3f80 or even 0x7f80.

If type unsigned char has a larger range than int, which can occur on rare systems where sizeof(int) == 1, x is promoted to unsigned int and the left shift is performed on this type. The value is 0x7f80U, which is guaranteed to fit in type unsigned int, and storing it to tmp does not actually lose any information since type unsigned char has the same size as unsigned int. So tmp would have the value 0x7f80 in this case.

unsigned char y = tmp >> 7;
The evaluation proceeds the same way as above: tmp is promoted to int or unsigned int depending on the system, which preserves its value, and this value is shifted right by 7 positions, which is fully defined because 7 is less than the width of the type (int or unsigned int) and the value is positive. Depending on the number of bits in type unsigned char, the value stored in y can be 1, 3, 7, 15, 31, 63, 127 or 255. The most common architecture will have y == 1.

printf("%x\n", y);
Again, it would be better to write printf("%hhx\n", y); and the output may be 1 (most common case) or 3, 7, f, 1f, 3f, 7f or ff depending on the number of value bits in type unsigned char.

unsigned char z = (x << 7) >> 7;
The integer promotion is performed on x as described above. The value (255) is then shifted left by 7 bits as an int or an unsigned int, always producing 0x7f80, and then shifted right by 7 positions, with a final value of 0xff. This behavior is fully defined.

printf("%x\n", z);
Once more, the call should be written printf("%hhx\n", z); and the output would always be ff.
Systems where bytes have more than 8 bits are becoming rare these days, but some embedded processors, such as specialized DSPs, still have them. It would take a perverse system to fail when passed an unsigned char for a %x conversion specifier, but it is cleaner to either use %hhx or, more portably, write printf("%x\n", (unsigned)z);

Shifting by 8 instead of 7 in this example would be even more contrived. It would have undefined behavior on systems with 16-bit int and 8-bit char, because 255 << 8 (0xff00) cannot be represented in a 16-bit int.
The 'intermediate' values in your last case are (full) integers, so the bits that are shifted 'out of range' of the original unsigned char type are retained, and thus they are still set when the result is converted back to a single byte.
From this C11 Draft Standard:
6.5.7 Bitwise shift operators
...
3 The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand ...
However, in your first case, unsigned char tmp = x << 7;, tmp loses the seven 'high' set bits when the resultant 'full' integer is converted (i.e. truncated) back to a single byte, giving a value of 0x80; when this is then right-shifted in unsigned char y = tmp >> 7;, the result is (as expected) 0x01.
The shift operators never operate directly on the char types. The value of any char operand is first promoted to int, and the result of the expression is converted to the char type only when it is stored in a char object.
So, when you put the left and right shift operators in the same expression, the calculation is performed in type int (without losing any bits), and only the final result is converted to char.