What is global _start in assembly language?
_start is just a label that points to a memory address, and global makes that label visible to the linker. For ELF binaries, _start is the default label that the linker uses as the address where the program starts.
There is also main (or _main, or main_), which is known to the C language and is called by "startup code" that is "usually" linked in if you're using C.
Hope this helps.
The global directive is NASM specific. It exports a symbol from your code into the generated object code. Here you mark the _start symbol global so that its name is added to the object code (a.o). The linker (ld) can read that symbol and its value from the object code, so it knows which address to mark as the entry point in the output executable. When you run the executable, it starts at the location marked _start in the code.
If the global directive is missing for a symbol, that symbol is not placed in the object code's export table, so the linker has no way of knowing about it.
If you want to use an entry point name other than _start (which is the default), you can pass the -e parameter to ld like:
ld -e my_entry_point -o out a.o
_start is used by the default Binutils ld linker script as the entry point
We can see the relevant part of that linker script with:
ld -verbose a.o | grep ENTRY
which outputs:
ENTRY(_start)
The ELF file format (and other object formats, I suppose) explicitly states the address at which the program will start running, through the e_entry header field.
ENTRY(_start) tells the linker to set that field to the address of the symbol _start when generating the ELF file from object files.
Then, when the OS starts running the program (exec system call on Linux), it parses the ELF file, loads the executable code into memory, and sets the instruction pointer to the specified address.
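To make the e_entry field a bit more concrete, here is a minimal C sketch that reads it from a 64-bit ELF file on Linux. The helper name read_elf_entry is made up for illustration. The sketch assumes that the Elf64_Ehdr definition from <elf.h> is available and that the file uses the same byte order as the host.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any library): read the e_entry field
 * of a 64-bit ELF file. Returns 0 on success and -1 on any failure.
 * Assumption: the file's byte order matches the host's. */
int read_elf_entry(const char *path, uint64_t *entry)
{
    Elf64_Ehdr hdr;
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* The ELF header sits at the very start of the file. */
    size_t n = fread(&hdr, 1, sizeof hdr, f);
    fclose(f);
    if (n != sizeof hdr)
        return -1;

    /* Check the magic bytes "\177ELF" and that this is a 64-bit file. */
    if (memcmp(hdr.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (hdr.e_ident[EI_CLASS] != ELFCLASS64)
        return -1;

    *entry = hdr.e_entry;
    return 0;
}
```

For a statically linked, non-PIE executable, the value stored in entry should match the address of the entry symbol (_start by default). For position-independent executables it is an offset that the loader relocates.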
The -e flag mentioned by Sedat overrides the default _start symbol.
You can also replace the entire default linker script with the -T <script> option, for example to set up some bare metal assembly stuff.
.global is an assembler directive that marks the symbol as global in the ELF file
The ELF file contains some metadata for every symbol, indicating its visibility.
The easiest way to observe this is with the nm tool.
For example in a Linux x86_64 GAS freestanding hello world:
main.S
.text
.global _start
_start:
asm_main_after_prologue:
    /* write */
    mov $1, %rax            /* syscall number */
    mov $1, %rdi            /* stdout */
    lea msg(%rip), %rsi     /* buffer */
    mov $len, %rdx          /* len */
    syscall

    /* exit */
    mov $60, %rax           /* syscall number */
    mov $0, %rdi            /* exit status */
    syscall
msg:
    .ascii "hello\n"
    len = . - msg
compile and run:
gcc -ffreestanding -static -nostdlib -o main.out main.S
./main.out
Running nm on main.out gives:
00000000006000ac T __bss_start
00000000006000ac T _edata
00000000006000b0 T _end
0000000000400078 T _start
0000000000400078 t asm_main_after_prologue
0000000000000006 a len
00000000004000a6 t msg
and man nm tells us that:
If lowercase, the symbol is usually local; if uppercase, the symbol is global (external).
so we see that _start is visible externally (upper case T), but msg, which we didn't mark as .global, isn't (lower case t).
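The upper or lower case letter printed by nm reflects the binding stored in each ELF symbol table entry's st_info field. As a rough sketch, assuming Linux's <elf.h> and a helper name (binding_name) invented here for illustration, that binding can be decoded like this:

```c
#include <elf.h>

/* Hypothetical helper: map the binding bits of a symbol's st_info byte
 * to a human-readable name. ELF64_ST_BIND extracts the upper four bits. */
const char *binding_name(unsigned char st_info)
{
    switch (ELF64_ST_BIND(st_info)) {
    case STB_LOCAL:
        return "local";   /* not visible outside the object file */
    case STB_GLOBAL:
        return "global";  /* visible to the linker and other objects */
    case STB_WEAK:
        return "weak";    /* global, but may be overridden */
    default:
        return "other";   /* OS- or processor-specific bindings */
    }
}
```

A symbol marked with .global ends up with the STB_GLOBAL binding, while an unmarked label keeps STB_LOCAL.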
The linker then knows how to blow up if multiple global symbols with the same name are seen, or do smarter things if more exotic symbol types are seen.
If we don't mark _start as global, ld becomes sad and says:
cannot find entry symbol _start
A label is not explicitly global until you declare it to be global so you have to use the global directive.
The global label "_start" is needed by the linker; if there is no global _start address, the linker will complain because it can't find one. You didn't declare _start as global, so it is not visible outside that module/object of code, and therefore not visible to the linker.
This is the opposite of C, where things are implied to be global unless you declare them to be local:
unsigned int hello;
int fun ( int a )
{
    return(a+1);
}
hello and fun are global, visible outside the object, but this
static unsigned int hello;
static int fun ( int a )
{
    return(a+1);
}
makes them local, not visible outside the object.
all local:
_start:
hello:
fun:
more_fun:
these are now global, available to the linker and other objects:
global _start
_start:
global hello
hello:
...