What is the use of the .byte assembler directive in GNU assembly?
Minimal runnable example
.byte spits out bytes wherever you are. Whether or not a label points to the byte does not matter.
If you happen to be in the text segment, then that byte might get run like code.
Carl mentioned it, but here is a complete example to let it sink in further: a Linux x86_64 implementation of true with a nop thrown in:
.global _start
_start:
    mov $60, %rax   # syscall number 60 = exit
    nop             # does nothing
    mov $0, %rdi    # exit status 0
    syscall
produces the exact same executable as:
.global _start
_start:
    mov $60, %rax   # syscall number 60 = exit
    .byte 0x90      # raw byte emitted in place of the nop mnemonic
    mov $0, %rdi    # exit status 0
    syscall
since nop is encoded as the byte 0x90.
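One way to check that nop really assembles to 0x90, without reaching for a disassembler, is to read the assembled byte back from C. The following is a minimal sketch, assuming GCC or Clang targeting x86-64 Linux (ELF); the symbol name nop_site and the helper first_byte_of_nop are made up for illustration.

```c
/* Assumption: GCC or Clang, x86-64 (or i386) Linux, GNU as syntax.
 * Emit a labelled nop into the text section. nop_site is a
 * hypothetical symbol name chosen for this sketch. */
__asm__(
    ".pushsection .text\n"
    ".globl nop_site\n"
    "nop_site:\n\t"
    "nop\n\t"
    "ret\n\t"
    ".popsection\n"
);

/* The label is the address of the first byte the assembler emitted. */
extern const unsigned char nop_site[];

/* Returns the machine-code byte that the assembler produced for nop. */
unsigned int first_byte_of_nop(void) {
    return nop_site[0];
}
```

Calling first_byte_of_nop() from a main function and printing the result in hex should show 90 on such a system.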
One use case: new instructions
One use case is when new instructions are added to a CPU ISA, but only bleeding-edge versions of the assembler support them.
Project maintainers may then choose to emit the instruction bytes directly, so that the code still assembles with older assemblers.
See for example this Spectre workaround in the Linux kernel, which uses the analogous .inst directive: https://github.com/torvalds/linux/blob/94710cac0ef4ee177a63b5227664b38c95bbf703/arch/arm/include/asm/barrier.h#L23
#define CSDB ".inst 0xe320f014"
A new instruction was added for Spectre, and the kernel decided to hardcode it for the time being.
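The same pattern works with .byte on x86. The sketch below assumes GCC or Clang on x86-64 and uses lfence purely as a stand-in: it is not a new instruction, but its encoding (0x0f 0xae 0xe8) is easy to cross-check against a disassembler. The macro name LFENCE_BYTES and the function fence_then_return are hypothetical and not taken from the kernel.

```c
/* Hypothetical macro in the style of the kernel's CSDB define:
 * the raw encoding of x86 lfence, written out as bytes so that the
 * assembler does not need to know the mnemonic. */
#define LFENCE_BYTES ".byte 0x0f, 0xae, 0xe8"

/* Executes the hand-encoded instruction, then returns x unchanged. */
int fence_then_return(int x) {
    __asm__ volatile(LFENCE_BYTES : : : "memory");
    return x;
}
```

Once every supported assembler knows the mnemonic, the macro body can be swapped for the plain instruction name without touching the call sites.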
Here's an example with inline assembly:
#include <stdio.h>

int main(void) {
    int dst;
    // .byte 0xb8, 0x01, 0x00, 0x00, 0x00 encodes mov $1, %eax
    asm (".byte 0xb8, 0x01, 0x00, 0x00, 0x00\n\t"
         "mov %%eax, %0"
         : "=r" (dst)
         :
         : "eax"  // tell the compiler we clobber eax
    );
    printf("dst value : %d\n", dst);
    return 0;
}
(See compiler asm output and also disassembly of the final binary on the Godbolt compiler explorer.)
You can replace .byte 0xb8, 0x01, 0x00, 0x00, 0x00 with mov $1, %%eax and the run result will be the same. This shows that the bytes emitted by .byte can encode an instruction, such as a mov.
There are a few possibilities... here are a couple I can think of off the top of my head:
1. You could access it relative to a label that comes after the .byte directive. Example:
.byte 0x0a
label: mov (label - 1), %eax
2. Depending on the final linked layout of the program, the .byte directives might get executed as code. Normally you'd have a label in this case too, though...
Some assemblers don't support generating x86 instruction prefixes for operand size, etc. In code written for those assemblers, you'll often see something like:
.byte 0x66
mov $12, %eax
to make the assembler emit the code you want.
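The effect of a hand-emitted operand-size prefix can be observed from C. This is a sketch assuming GCC or Clang on x86-64 (or 32-bit x86); the function name mov_with_prefix is made up. It uses a register-to-register mov rather than the immediate form above, because the 0x66 prefix also changes the width of an immediate operand, whereas the register form keeps the same length.

```c
/* Assumption: GCC or Clang inline asm, x86-64 or i386.
 * mov %ecx, %eax normally copies all 32 bits. With a 0x66
 * operand-size prefix in front, the CPU decodes it as mov %cx, %ax
 * and copies only the low 16 bits. */
unsigned int mov_with_prefix(unsigned int dst, unsigned int src) {
    __asm__ volatile(
        ".byte 0x66\n\t"      /* operand-size override prefix */
        "mov %%ecx, %%eax"    /* registers pinned below: src in ecx, dst in eax */
        : "+a"(dst)
        : "c"(src));
    return dst;
}
```

With dst = 0xAAAAAAAA and src = 0x12345678, the call returns 0xAAAA5678: only the low half of the destination is overwritten, which shows the prefix byte took effect.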