What happens before main()?

This depends entirely on the compiler and architecture, but in general the startup code initializes the most basic hardware required for the rest of the code to run. For example, it:

  • Defines the reset vectors

  • Defines the layout of data in memory (many systems use a linker script instead)

  • Defines the addresses of interrupt service routines in a big table (the interrupt vector table)

  • Initializes CPU registers, e.g. the stack pointer

  • Configures the core clock

In addition, the startup code serves the runtime needs of the programming language used. It:

  • Initializes whatever parameter-passing mechanism function calls use

  • Initializes global variables by e.g. copying flash contents to RAM and zero-initializing memory

  • If dynamic memory allocation is used, initializes the heap

  • If floating point math is enabled, initializes the FPU (if available) or initializes the floating point library

  • If exceptions are used, initializes exception handling.

Ubuntu 20.04 glibc 2.31 RTFS + GDB

glibc does some setup before main so that some of its functionalities will work. Let's try to track down the source code for that.


#include <stdio.h>

int main() {
    return 0;
}

Compile and debug:

gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o hello.out hello.c
gdb hello.out

Now in GDB:

b main
r
bt -past-main

This gives:

#0  main () at hello.c:3
#1  0x00007ffff7dc60b3 in __libc_start_main (main=0x555555555149 <main()>, argc=1, argv=0x7fffffffbfb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffbfa8) at ../csu/libc-start.c:308
#2  0x000055555555508e in _start ()

This already contains the line of the caller of main: https://github.com/cirosantilli/glibc/blob/glibc-2.31/csu/libc-start.c#L308.

The function has a billion ifdefs as can be expected from the level of legacy/generality of glibc, but some key parts which seem to take effect for us should simplify to:

# define LIBC_START_MAIN __libc_start_main

LIBC_START_MAIN (int (*main) (int, char **, char **),
         int argc, char **argv,

      /* Initialize some stuff. */

      result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
  exit (result);

Before __libc_start_main we are already at _start, which we know is the entry point because adding gcc -Wl,--verbose shows that the linker script contains:

ENTRY(_start)

and it is therefore the actual very first instruction executed after the dynamic loader finishes.

To confirm that in GDB, we can get rid of the dynamic loader by compiling with -static:

gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -static -o hello.out hello.c
gdb hello.out

and then make GDB stop at the very first instruction executed with starti and print the first instructions:

starti
display/12i $pc

which gives:

=> 0x401c10 <_start>:   endbr64 
   0x401c14 <_start+4>: xor    %ebp,%ebp
   0x401c16 <_start+6>: mov    %rdx,%r9
   0x401c19 <_start+9>: pop    %rsi
   0x401c1a <_start+10>:        mov    %rsp,%rdx
   0x401c1d <_start+13>:        and    $0xfffffffffffffff0,%rsp
   0x401c21 <_start+17>:        push   %rax
   0x401c22 <_start+18>:        push   %rsp
   0x401c23 <_start+19>:        mov    $0x402dd0,%r8
   0x401c2a <_start+26>:        mov    $0x402d30,%rcx
   0x401c31 <_start+33>:        mov    $0x401d35,%rdi
   0x401c38 <_start+40>:        addr32 callq 0x4020d0 <__libc_start_main>

By grepping the source for _start and focusing on x86_64 hits we see that this seems to correspond to sysdeps/x86_64/start.S:58:

ENTRY (_start)
    /* Clearing frame pointer is insufficient, use CFI.  */
    cfi_undefined (rip)
    /* Clear the frame pointer.  The ABI suggests this be done, to mark
       the outermost frame obviously.  */
    xorl %ebp, %ebp

    /* Extract the arguments as encoded on the stack and set up
       the arguments for __libc_start_main (int (*main) (int, char **, char **),
           int argc, char *argv,
           void (*init) (void), void (*fini) (void),
           void (*rtld_fini) (void), void *stack_end).
       The arguments are passed via registers and on the stack:
    main:       %rdi
    argc:       %rsi
    argv:       %rdx
    init:       %rcx
    fini:       %r8
    rtld_fini:  %r9
    stack_end:  stack.  */

    mov %RDX_LP, %R9_LP /* Address of the shared library termination
                   function.  */
#ifdef __ILP32__
    mov (%rsp), %esi    /* Simulate popping 4-byte argument count.  */
    add $4, %esp
#else
    popq %rsi       /* Pop the argument count.  */
#endif
    /* argv starts just at the current stack top.  */
    mov %RSP_LP, %RDX_LP
    /* Align the stack to a 16 byte boundary to follow the ABI.  */
    and  $~15, %RSP_LP

    /* Push garbage because we push 8 more bytes.  */
    pushq %rax

    /* Provide the highest stack address to the user code (for stacks
       which grow downwards).  */
    pushq %rsp

#ifdef PIC
    /* Pass address of our own entry points to .fini and .init.  */
    mov __libc_csu_fini@GOTPCREL(%rip), %R8_LP
    mov __libc_csu_init@GOTPCREL(%rip), %RCX_LP

    mov main@GOTPCREL(%rip), %RDI_LP
#else
    /* Pass address of our own entry points to .fini and .init.  */
    mov $__libc_csu_fini, %R8_LP
    mov $__libc_csu_init, %RCX_LP

    mov $main, %RDI_LP
#endif

    /* Call the user's main function, and exit with its value.
       But let the libc call main.  Since __libc_start_main in
       libc.so is called very early, lazy binding isn't relevant
       here.  Use indirect branch via GOT to avoid extra branch
       to PLT slot.  In case of static executable, ld in binutils
       2.26 or above can convert indirect branch into direct
       branch.  */
    call *__libc_start_main@GOTPCREL(%rip)

which ends up calling __libc_start_main as expected.

Unfortunately -static makes the bt from main not show as much info:

#0  main () at hello.c:3
#1  0x0000000000402560 in __libc_start_main ()
#2  0x0000000000401c3e in _start ()

If we remove -static and start from starti, we get instead:

=> 0x7ffff7fd0100 <_start>:     mov    %rsp,%rdi
   0x7ffff7fd0103 <_start+3>:   callq  0x7ffff7fd0df0 <_dl_start>
   0x7ffff7fd0108 <_dl_start_user>:     mov    %rax,%r12
   0x7ffff7fd010b <_dl_start_user+3>:   mov    0x2c4e7(%rip),%eax        # 0x7ffff7ffc5f8 <_dl_skip_args>
   0x7ffff7fd0111 <_dl_start_user+9>:   pop    %rdx

By grepping the source for _dl_start_user, this seems to come from sysdeps/x86_64/dl-machine.h:L147:

/* Initial entry point code for the dynamic linker.
   The C function `_dl_start' is the real entry point;
   its return value is the user program's entry point.  */
#define RTLD_START asm ("\n\
    .align 16\n\
.globl _start\n\
.globl _dl_start_user\n\
    movq %rsp, %rdi\n\
    call _dl_start\n\
    # Save the user entry point address in %r12.\n\
    movq %rax, %r12\n\
    # See if we were run as a command with the executable file\n\
    # name as an extra leading argument.\n\
    movl _dl_skip_args(%rip), %eax\n\
    # Pop the original argument count.\n\
    popq %rdx\n\

and this is presumably the dynamic loader entry point.

If we break at _start and continue, this seems to end up in the same location as when we used -static, which then calls __libc_start_main.


TODO:

  • comment on concrete easy-to-understand examples of what glibc is doing before main. This gives some ideas: https://stackoverflow.com/questions/53570678/what-happens-before-main-in-c/53571224#53571224
  • make GDB show the source itself without us having to look at it separately, possibly with us building glibc ourselves: https://stackoverflow.com/questions/10412684/how-to-compile-my-own-glibc-c-standard-library-from-source-and-use-it/52454710#52454710
  • understand how the above source code maps to objects such as crti.o that can be seen with gcc --verbose main.c and which end up getting added to the final link

Somewhat related question: Who receives the value returned by main()?

main() is an ordinary C function, so it requires certain things to be initialized before it is called. These are related to:

  • Setting up a valid stack
  • Creating a valid argument list (usually on the stack)
  • Initializing the interrupt-handling hardware
  • Initializing global and static variables (including library code)

The last item includes such things as setting up a memory pool that malloc() and free() can use, if your environment supports dynamic memory allocation. Similarly, any form of "standard I/O" that your system might have access to will also be initialized.

Pretty much anything else is going to be application-dependent, and will have to be initialized from within main(), before you enter your "main loop".


