What is the use of the .byte assembler directive in GNU assembly?

Minimal runnable example

.byte spits out bytes wherever you are. It does not matter whether a label points to the byte or not.

If you happen to be in the text segment, then that byte might get run like code.
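
For the data side of this, here is a minimal sketch in C using top-level asm. It assumes GCC or Clang on an ELF target such as Linux, and the names demo_bytes and demo_byte_at are made up for illustration. .byte drops three bytes into .rodata, and C reads them back through the label:

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC or Clang targeting ELF (e.g. Linux). demo_bytes is a
   made-up symbol name used only for this illustration. */
__asm__(
    ".pushsection .rodata\n"
    ".globl demo_bytes\n"
    "demo_bytes:\n"
    "    .byte 0x0a, 0x14, 0x1e\n"   /* three raw bytes: 10, 20, 30 */
    ".popsection\n"
);

/* The label defined in the asm above, viewed from C as a byte array. */
extern const unsigned char demo_bytes[3];

/* Hypothetical helper: return the i-th byte emitted by .byte. */
unsigned char demo_byte_at(size_t i) {
    assert(i < 3);
    return demo_bytes[i];
}
```

Calling demo_byte_at(0) from main should give back 0x0a, the first byte listed in the directive.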

Carl mentioned it, but here is a complete example to let it sink in further: a Linux x86_64 implementation of true with a nop thrown in:

.global _start
_start:
    mov $60, %rax
    nop
    mov $0, %rdi
    syscall

produces the exact same executable as:

.global _start
_start:
    mov $60, %rax
    .byte 0x90
    mov $0, %rdi
    syscall

since nop is encoded as the byte 0x90.

One use case: new instructions

One use case is when new instructions are added to a CPU ISA, but only bleeding-edge versions of the assembler support them.

So project maintainers may choose to emit the bytes directly, so that the code still assembles with older assemblers.

See for example this Spectre workaround in the Linux kernel, which uses the analogous .inst directive: https://github.com/torvalds/linux/blob/94710cac0ef4ee177a63b5227664b38c95bbf703/arch/arm/include/asm/barrier.h#L23

#define CSDB    ".inst  0xe320f014"

A new instruction was added for Spectre, and the kernel decided to hardcode it for the time being.
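
A rough x86 analogue of that pattern can be sketched in C. This is only an illustration, not what the kernel does, and the names CPU_RELAX_BYTES and relax_n_times are hypothetical. The pause instruction is encoded as 0xf3 0x90 (a rep prefix on nop), so an assembler that does not know the mnemonic can still emit it with .byte, and CPUs that predate it just run a nop. On other architectures the sketch assumes an empty asm statement is an acceptable fallback:

```c
#include <assert.h>

/* Hypothetical macro name; 0xf3 0x90 is the encoding of x86 "pause". */
#if defined(__x86_64__) || defined(__i386__)
#define CPU_RELAX_BYTES ".byte 0xf3, 0x90"
#else
#define CPU_RELAX_BYTES ""   /* assumption: no-op fallback elsewhere */
#endif

/* Hypothetical helper: execute the hardcoded instruction n times and
   return how many times the loop body ran. */
int relax_n_times(int n) {
    int i;
    int done = 0;
    assert(n >= 0);
    for (i = 0; i < n; i++) {
        /* The bytes are pasted into the instruction stream as-is. */
        __asm__ volatile (CPU_RELAX_BYTES ::: "memory");
        done++;
    }
    return done;
}
```

Once every assembler you care about knows the mnemonic, the macro body can be swapped for the plain instruction without touching the callers.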


Here's an example with inline assembly:

#include <stdio.h>

int main(void) {
    int dst;
    // .byte 0xb8, 0x01, 0x00, 0x00, 0x00 encodes mov $1, %eax (x86 only)
    asm (".byte 0xb8, 0x01, 0x00, 0x00, 0x00\n\t"
         "mov %%eax, %0"
         : "=r" (dst)
         : : "eax"  // tell the compiler we clobber eax
    );
    printf("dst value : %d\n", dst);
    return 0;
}

(See compiler asm output and also disassembly of the final binary on the Godbolt compiler explorer.)

You can replace .byte 0xb8, 0x01, 0x00, 0x00, 0x00 with mov $1, %%eax and the result of running the program will be the same. This shows that the bytes emitted by .byte can encode an instruction, such as mov or any other.


There are a few possibilities... here are a couple I can think of off the top of my head:

  1. You could access it relative to a label that comes after the .byte directive. Example:

      .byte 0x0a
    label:
      movzbl (label - 1), %eax    # load only the one byte, zero-extended
    
  2. Depending on the final linked layout of the program, the bytes emitted by .byte may end up being executed as code. Normally you'd have a label in this case too, though...

  3. Some assemblers don't support generating x86 instruction prefixes for operand size, etc. In code written for those assemblers, you'll often see something like:

      .byte 0x66
      mov $12, %eax
    

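    This makes the assembler emit the encoding you actually want. Keep in mind that 0x66 toggles the operand size, so the immediate emitted after it must have the width the CPU will then expect.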
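
A harmless way to try the hand-written prefix from point 3 is to put it in front of a nop, sketched here in C with inline assembly. The function name prefixed_nop_passthrough is hypothetical, and the asm is only compiled on x86. The bytes 66 90 form the usual two-byte nop, so nothing observable changes:

```c
/* Hypothetical helper: run a nop carrying a hand-written 0x66
   operand-size prefix (bytes 66 90) and return x unchanged. */
int prefixed_nop_passthrough(int x) {
#if defined(__x86_64__) || defined(__i386__)
    __asm__ volatile (
        ".byte 0x66\n\t"   /* operand-size prefix emitted by hand */
        "nop"              /* the assembler emits 0x90 right after it */
    );
#endif
    return x;
}
```

Disassembling the object file should show the prefix and the nop merged into a single two-byte instruction.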