Convert Little Endian to Big Endian

OP's sample code is incorrect.

Endian conversion works at the bit and 8-bit byte level, and most endian issues concern the byte level. OP's code performs an endian change at the 4-bit nibble level. I recommend this instead:

// Swap endian (big to little) or (little to big)
uint32_t num = 9;
uint32_t b0,b1,b2,b3;
uint32_t res;

b0 = (num & 0x000000ff) << 24u;
b1 = (num & 0x0000ff00) << 8u;
b2 = (num & 0x00ff0000) >> 8u;
b3 = (num & 0xff000000) >> 24u;

res = b0 | b1 | b2 | b3;

printf("%" PRIX32 "\n", res);

If performance is truly important, the particular processor would need to be known. Otherwise, leave it to the compiler.
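
One way to "leave it to the compiler" is to wrap the mask-and-shift logic in a small function. This is only a sketch: the name swap32 is a placeholder chosen here, not a standard function. Optimizing compilers such as GCC and Clang often recognize this pattern and may reduce it to a single byte-swap instruction, but that is not guaranteed.

```c
#include <stdint.h>

/* swap32 is a hypothetical helper name, not part of any standard library.
   It reverses the order of the four bytes in a 32-bit value. */
static inline uint32_t swap32(uint32_t num)
{
    return ((num & 0x000000ffu) << 24) |  /* lowest byte  -> highest byte */
           ((num & 0x0000ff00u) << 8)  |  /* byte 1       -> byte 2       */
           ((num & 0x00ff0000u) >> 8)  |  /* byte 2       -> byte 1       */
           ((num & 0xff000000u) >> 24);   /* highest byte -> lowest byte  */
}
```

For example, swap32(9) yields 0x09000000, and applying swap32 twice returns the original value.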

[Edit] OP added a comment that changes things.
"32bit numerical value represented by the hexadecimal representation (st uv wx yz) shall be recorded in a four-byte field as (st uv wx yz)."

It appears that in this case the endianness of the 32-bit number is unknown, and the result needs to be stored in memory in little-endian order.

uint32_t num = 9;
uint8_t b[4];
b[0] = (uint8_t) (num >>  0u);
b[1] = (uint8_t) (num >>  8u);
b[2] = (uint8_t) (num >> 16u);
b[3] = (uint8_t) (num >> 24u);
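
The same idea can be packaged as a pair of helpers, sketched here under the assumption that the four-byte field is a plain uint8_t array. The names store_le32 and load_le32 are hypothetical. Because the code uses only shifts on the value, it does not depend on the endianness of the machine it runs on.

```c
#include <stdint.h>

/* store_le32 and load_le32 are hypothetical helper names for this sketch. */

/* Write num into b[0..3], least significant byte first (little-endian). */
static inline void store_le32(uint8_t b[4], uint32_t num)
{
    b[0] = (uint8_t)(num >> 0);
    b[1] = (uint8_t)(num >> 8);
    b[2] = (uint8_t)(num >> 16);
    b[3] = (uint8_t)(num >> 24);
}

/* Rebuild the 32-bit value from four little-endian bytes. */
static inline uint32_t load_le32(const uint8_t b[4])
{
    return  (uint32_t)b[0]        |
           ((uint32_t)b[1] << 8)  |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

Storing 0x12345678 this way gives the bytes 78 56 34 12 in memory, and load_le32 on that array returns 0x12345678 again.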

[2016 Edit] Simplification

... The type of the result is that of the promoted left operand.... Bitwise shift operators C11 §6.5.7 3

Adding a u suffix to the shift constants (the right operands) gives the same result as omitting it.

b3 = (num & 0xff000000) >> 24u;
b[3] = (uint8_t) (num >> 24u);
// same as 
b3 = (num & 0xff000000) >> 24;
b[3] = (uint8_t) (num >> 24);

I think you can use the function htonl(). It converts from host byte order to network byte order, which is big endian.
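
A minimal sketch of how htonl() might be used, assuming a POSIX system where it is declared in <arpa/inet.h> (on Windows it comes from <winsock2.h> instead). The helper name put_be32 is hypothetical. Note that htonl() is not an unconditional swap: on a big-endian host it returns its argument unchanged.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>  /* POSIX header for htonl(); an assumption about the platform */

/* put_be32 is a hypothetical helper name used only for this sketch.
   It writes host_value into out[0..3] in big-endian (network) order. */
static inline void put_be32(uint8_t out[4], uint32_t host_value)
{
    uint32_t net = htonl(host_value);  /* host order -> network (big-endian) order */
    memcpy(out, &net, sizeof net);     /* copy the bytes exactly as they sit in memory */
}
```

After put_be32(buf, 0x12345678), buf holds the bytes 12 34 56 78 regardless of the host's endianness.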

Sorry, my answer is a bit late, but it seems nobody has mentioned the built-in functions for reversing byte order, which are very important in terms of performance.

Most modern processors are little-endian, while most network protocols are big-endian. The reasons are historical, and you can find more on Wikipedia. But it means our processors convert between little- and big-endian millions of times while we browse the Internet.

That is why most architectures have dedicated processor instructions to facilitate this task. For x86 architectures there is the BSWAP instruction, and for ARM there is REV. This is the most efficient way to reverse byte order.

To avoid assembly in our C code, we can use built-ins instead. For GCC there is the __builtin_bswap32() function and for Visual C++ there is _byteswap_ulong(). Those functions will generate just one processor instruction on most architectures.

Here is an example:

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    uint32_t le = 0x12345678;
    uint32_t be = __builtin_bswap32(le);

    printf("Little-endian: 0x%" PRIx32 "\n", le);
    printf("Big-endian:    0x%" PRIx32 "\n", be);

    return 0;
}

Here is the output it produces:

Little-endian: 0x12345678
Big-endian:    0x78563412

And here is the disassembly (without optimization, i.e. -O0):

        uint32_t be = __builtin_bswap32(le);
   0x0000000000400535 <+15>:    mov    -0x8(%rbp),%eax
   0x0000000000400538 <+18>:    bswap  %eax
   0x000000000040053a <+20>:    mov    %eax,-0x4(%rbp)

There is just one BSWAP instruction indeed.

So, if we do care about performance, we should use those built-in functions instead of any other method of byte reversing. Just my 2 cents.
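
Since the two built-ins named above are compiler-specific, one possible way to use them from the same source is a small wrapper selected by preprocessor checks. This is a sketch, not a definitive implementation: the name bswap32_portable is hypothetical, and the plain shift-and-mask fallback is an assumption for compilers that offer neither built-in.

```c
#include <stdint.h>
#if defined(_MSC_VER)
#include <stdlib.h>  /* declares _byteswap_ulong() for Visual C++ */
#endif

/* bswap32_portable is a hypothetical wrapper name for this sketch. */
static inline uint32_t bswap32_portable(uint32_t x)
{
#if defined(_MSC_VER)
    /* unsigned long is 32 bits wide with Visual C++ */
    return (uint32_t)_byteswap_ulong((unsigned long)x);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#else
    /* Fallback: reverse the four bytes with shifts and masks. */
    return (x << 24) |
           ((x & 0x0000ff00u) << 8) |
           ((x & 0x00ff0000u) >> 8) |
           (x >> 24);
#endif
}
```

With this wrapper, bswap32_portable(0x12345678) returns 0x78563412 on any of the three paths.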


