printf adds extra `FFFFFF` to hex print from a char array
Sign extension. Your compiler is implementing `char` as a `signed char`. When you pass the chars to `printf`, they are all sign extended during their promotion to `int`s. When the first bit is a 0 this doesn't matter, because it gets extended with 0s.
`0xAF` in binary is `10101111`. Since the first bit is a 1, when passing it to `printf` it is extended with all 1s in the conversion to `int`, making it `11111111111111111111111110101111`, which is the hex value you have.
Solution: Use `unsigned char` (instead of `char`) to prevent the sign extension from occurring in the call:
const unsigned char raw[] = {0x20,0x00,0xAF,0x00,0x69,0x00,0x33,0x00,0x5A,0x00};
All of the values in your original example are being sign extended; it's just that `0xAF` is the only one with a 1 in the first bit.
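To make the fix concrete, here is a minimal sketch that formats such an array as hex text. The helper name `bytes_to_hex` and the output-buffer convention are my own assumptions for illustration; they are not part of the original question.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from the original post): writes each byte of
   `data` as two uppercase hex digits into `out` and returns the number of
   characters written. `out` must hold at least 2 * len + 1 bytes. */
static size_t bytes_to_hex(const unsigned char *data, size_t len, char *out)
{
    size_t pos = 0;
    for (size_t i = 0; i < len; ++i) {
        /* An unsigned char is never negative, so nothing is sign extended;
           the explicit cast makes the argument match %X exactly. */
        pos += (size_t)sprintf(out + pos, "%02X", (unsigned int)data[i]);
    }
    out[pos] = '\0';
    return pos;
}
```

Called on the `raw` array above with a buffer of `2 * sizeof raw + 1` bytes, this should produce two digits per element, with no `FFFFFF` prefix on the `0xAF` entry.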
Another, simpler example of the same behavior:
signed char c = 0xAF; // probably gives an overflow warning
int i = c; // extra 24 bits are all 1
assert( i == 0xFFFFFFAF );
`printf()` is a variadic function, and its additional arguments (those corresponding to the `...` part of its prototype) are subject to the default argument promotions, so a `char` is promoted to `int`.
Because your `char` is signed¹ and uses two's complement representation, the most significant bit of the `0xAF` element is set to one. During promotion the sign bit is propagated, resulting in `0xFFFFFFAF` of type `int`, since `sizeof(int)` is presumably 4 in your implementation.
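The promotion can be observed directly with a small variadic function of your own. This is only an illustrative sketch; `first_extra_arg` is a hypothetical name, and it reads its extra argument as `int` because that is the type a `char` argument has after the default argument promotions.

```c
#include <stdarg.h>

/* Hypothetical variadic function: returns its first extra argument,
   read as int (the promoted type of any char argument). `count` is
   unused apart from anchoring va_start. */
static int first_extra_arg(int count, ...)
{
    va_list ap;
    va_start(ap, count);
    int value = va_arg(ap, int);
    va_end(ap);
    return value;
}
```

Passing a `signed char` holding -81 (bit pattern `0xAF`) should yield -81, while an `unsigned char` holding `0xAF` should yield 175.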
By the way, you are technically invoking undefined behaviour, since the `%X` format specifier should be used with an argument of type `unsigned int`, or at least with an `int` whose MSB is unset (this is common, widely accepted practice).
As suggested, consider using the unambiguous `unsigned char` type.
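If the array has to stay plain `char`, one variation is to convert each element to `unsigned char` at the call site. The sketch below assumes a hypothetical helper `char_to_hex`; the important part is that the conversion to `unsigned char` happens before the promotion.

```c
#include <stdio.h>

/* Hypothetical helper: formats one plain char as two hex digits.
   The conversion to unsigned char happens before promotion, so the
   value handed to snprintf is always in the range 0..255. */
static int char_to_hex(char c, char out[3])
{
    return snprintf(out, 3, "%02X", (unsigned int)(unsigned char)c);
}
```

The outer cast to `unsigned int` also makes the argument match what `%X` expects, which sidesteps the undefined-behaviour concern.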
1) An implementation may choose between a signed and an unsigned representation of `char`. It is rather common for `char` to be signed, but you cannot take that for granted with every compiler. Some of them let you choose between the two modes, as mentioned in Jens's answer.
That's because `0xAF`, when converted from a signed character to a signed integer, is negative (it is sign extended), and the `%02X` format is for unsigned arguments and prints the converted value as `FFFFFFAF`.
The extra characters appear because printf `%x` will never silently truncate digits off of a value. Values which are non-negative get sign extended as well, but that just adds zero bits and the value still fits in 2 hex digits, so printf `%02` can make do with a two-digit output.
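A quick way to see that the `02` is only a minimum width is to format a few values and look at the lengths. This is a sketch with a hypothetical helper `hex_width`, assuming a 32-bit `unsigned int`.

```c
#include <stdio.h>

/* Hypothetical helper: formats `value` with "%02X" into `out` and
   returns the number of digits snprintf produced. */
static int hex_width(unsigned int value, char *out, size_t size)
{
    return snprintf(out, size, "%02X", value);
}
```

With a sufficiently large buffer, `0x5` should come out as `05` (padded to two digits), `0xAF` as `AF`, and `0xFFFFFFAF` as all eight digits, since the width never truncates.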
Note that there are 2 C dialects: one where plain `char` is signed, and one where it is unsigned. In yours it is signed. You may change it using an option, e.g. gcc and clang support `-funsigned-char` and `-fsigned-char`.
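If you want to check which dialect a given build uses, `CHAR_MIN` from `<limits.h>` tells you. A minimal sketch, with `plain_char_is_signed` being a name I made up for illustration:

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if plain char is signed in this build,
   0 if it is unsigned. CHAR_MIN is 0 exactly when char is unsigned. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Compiling the same file with and without `-funsigned-char` should flip the result.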