Does Linux kernel have main function?

start_kernel

On 4.2, start_kernel from init/main.c is a considerable initialization process and could be compared to a main function.

It is the first arch independent code to run, and sets up a large part of the kernel. So much like main, start_kernel is preceded by some lower level setup code (done in the crt* objects in userland main), after which the "main" generic C code runs.

How start_kernel gets called in x86_64

arch/x86/kernel/vmlinux.lds.S, a linker script, sets:

ENTRY(phys_startup_64)

and

phys_startup_64 = startup_64 - LOAD_OFFSET;

and:

#define LOAD_OFFSET __START_KERNEL_map

arch/x86/include/asm/page_64_types.h defines __START_KERNEL_map as:

#define __START_KERNEL_map  _AC(0xffffffff80000000, UL)

which is the kernel entry address. TODO how is that address reached exactly? I have to understand the interface Linux exposes to bootloaders.

arch/x86/kernel/vmlinux.lds.S sets the very first bootloader section as:

.text :  AT(ADDR(.text) - LOAD_OFFSET) {
    _text = .;
    /* bootstrapping code */
    HEAD_TEXT

include/asm-generic/vmlinux.lds.h defines HEAD_TEXT:

#define HEAD_TEXT  *(.head.text)

arch/x86/kernel/head_64.S defines startup_64. That is the very first x86 kernel code that runs. It does a lot of low level setup, including segmentation and paging.

That is then the first thing that runs because the file starts with:

.text
__HEAD
.code64
.globl startup_64

and include/linux/init.h defines __HEAD as:

#define __HEAD      .section    ".head.text","ax"

so the same as the very first thing of the linker script.

At the end it calls x86_64_start_kernel a bit awkwardly with and lretq:

movq    initial_code(%rip),%rax
pushq   $0      # fake return address to stop unwinder
pushq   $__KERNEL_CS    # set correct cs
pushq   %rax        # target address in negative space
lretq

and:

.balign 8
GLOBAL(initial_code)
.quad   x86_64_start_kernel

arch/x86/kernel/head64.c defines x86_64_start_kernel which calls x86_64_start_reservations which calls start_kernel.

arm64 entry point

The very first arm64 that runs on an v5.7 uncompressed kernel is defined at https://github.com/cirosantilli/linux/blob/v5.7/arch/arm64/kernel/head.S#L72 so either the add x13, x18, #0x16 or b stext depending on CONFIG_EFI:

    __HEAD
_head:
    /*
     * DO NOT MODIFY. Image header expected by Linux boot-loaders.
     */
#ifdef CONFIG_EFI
    /*
     * This add instruction has no meaningful effect except that
     * its opcode forms the magic "MZ" signature required by UEFI.
     */
    add x13, x18, #0x16
    b   stext
#else
    b   stext               // branch to kernel start, magic
    .long   0               // reserved
#endif
    le64sym _kernel_offset_le       // Image load offset from start of RAM, little-endian
    le64sym _kernel_size_le         // Effective size of kernel image, little-endian
    le64sym _kernel_flags_le        // Informative flags, little-endian
    .quad   0               // reserved
    .quad   0               // reserved
    .quad   0               // reserved
    .ascii  ARM64_IMAGE_MAGIC       // Magic number
#ifdef CONFIG_EFI
    .long   pe_header - _head       // Offset to the PE header.

This is also the very first byte of an uncompressed kernel image.

Both of those cases jump to stext which starts the "real" action.

As mentioned in the comment, these two instructions are the first 64 bytes of a documented header described at: https://github.com/cirosantilli/linux/blob/v5.7/Documentation/arm64/booting.rst#4-call-the-kernel-image

arm64 first MMU enabled instruction: __primary_switched

I think it is __primary_switched in head.S:

/*
 * The following fragment of code is executed with the MMU enabled.
 *
 *   x0 = __PHYS_OFFSET
 */
__primary_switched:

At this point, the kernel appears to create page tables + maybe relocate itself such that the PC addresses match the symbols of the vmlinux ELF file. Therefore at this point you should be able to see meaningful function names in GDB without extra magic.

arm64 secondary CPU entry point

secondary_holding_pen defined at: https://github.com/cirosantilli/linux/blob/v5.7/arch/arm64/kernel/head.S#L691

Entry procedure further described at: https://github.com/cirosantilli/linux/blob/v5.7/arch/arm64/kernel/head.S#L691


With main() you propably mean what main() is to a program, namely its "entry point".

For a module that is init_module().

From Linux Device Driver's 2nd Edition:

Whereas an application performs a single task from beginning to end, a module registers itself in order to serve future requests, and its "main" function terminates immediately. In other words, the task of the function init_module (the module's entry point) is to prepare for later invocation of the module's functions; it's as though the module were saying, "Here I am, and this is what I can do." The second entry point of a module, cleanup_module, gets invoked just before the module is unloaded. It should tell the kernel, "I'm not there anymore; don't ask me to do anything else."


Fundamentally, there is nothing special about a routine being named main(). As alluded to above, main() serves as the entry point for an executable load module. However, you can define different entry points for a load module. In fact, you can define more than one entry point, for example, refer to your favorite dll.

From the operating system's (OS) point of view, all it really needs is the address of the entry point of the code that will function as a device driver. The OS will pass control to that entry point when the device driver is required to perform I/O to the device.

A system programmer defines (each OS has its own method) the connection between a device, a load module that functions as the device's driver, and the name of the entry point in the load module.

Each OS has its own kernel (obviously) and some might/maybe start with main() but I would be surprised to find a kernel that used main() other than in a simple one, such as UNIX! By the time you are writing kernel code you have long moved past the requirement to name every module you write as main().

Hope this helps?

Found this code snippet from the kernel for Unix Version 6. As you can see main() is just another program, trying to get started!

main()
{
     extern schar;
     register i, *p;
     /*
     * zero and free all of core
     */

     updlock = 0;
     i = *ka6 + USIZE;
     UISD->r[0] = 077406;
     for(;;) {
        if(fuibyte(0) < 0) break;
        clearsig(i);
        maxmem++;
        mfree(coremap, 1, i);
         i++;
     }
     if(cputype == 70) 
     for(i=0; i<62; i=+2) {
       UBMAP->r[i] = i<<12;
       UBMAP->r[i+1] = 0;
      }

    // etc. etc. etc.

Several ways to look at it:

  1. Device drivers are not programs. They are modules that are loaded into another program (the kernel). As such, they do not have a main() function.

  2. The fact that all programs must have a main() function is only true for userspace applications. It does not apply to the kernel, nor to device drivers.