What's the purpose of stack pointer alignment in the prologue of main()?
The System V AMD64 ABI (the x86-64 ABI) requires 16-byte stack alignment: double requires 8-byte alignment, and SSE extensions require 16-byte alignment.
The GCC documentation points this out in its description of the -mpreferred-stack-boundary option:
-mpreferred-stack-boundary=num
Attempt to keep the stack boundary aligned to a 2 raised to num byte boundary. If -mpreferred-stack-boundary is not specified, the default is 4 (16 bytes or 128 bits).
Warning: When generating code for the x86-64 architecture with SSE extensions disabled, -mpreferred-stack-boundary=3 can be used to keep the stack boundary aligned to 8 byte boundary. Since x86-64 ABI require 16 byte stack alignment, this is ABI incompatible and intended to be used in controlled environment where stack space is important limitation. This option leads to wrong code when functions compiled with 16 byte stack alignment (such as functions from a standard library) are called with misaligned stack. In this case, SSE instructions may lead to misaligned memory access traps. In addition, variable arguments are handled incorrectly for 16 byte aligned objects (including x87 long double and __int128), leading to wrong results. You must build all modules with -mpreferred-stack-boundary=3, including any libraries. This includes the system libraries and startup modules.
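The relationship between num and the boundary in the quoted text is plain power-of-two arithmetic. A minimal sketch in C, where stack_boundary_bytes is a hypothetical helper name used only for illustration (it is not part of GCC):

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper, not a GCC API: converts the option's exponent
   (num) into the stack boundary in bytes, i.e. 2 raised to num. */
size_t stack_boundary_bytes(unsigned num)
{
    assert(num < sizeof(size_t) * CHAR_BIT); /* the shift must stay in range */
    return (size_t)1 << num;
}
```

With this helper, the default num of 4 gives 16 bytes, and num of 3 gives the 8-byte boundary that the warning describes as ABI-incompatible on x86-64.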
Modern versions of the i386 System V ABI have the same 16-byte stack alignment requirement / guarantee as x86-64 System V (which @ouah's answer mentions).
This includes a guarantee that the kernel will have aligned %esp by 16 at _start. So CRT startup code that also maintains 16-byte alignment will call main with the stack 16-byte aligned.
Historically, the i386 System V ABI only required 4-byte stack alignment, and aligning the stack by 16 was just something compilers could choose to do; GCC defaulted to -mpreferred-stack-boundary=4 when it was just a good idea, not the law (on macOS and Linux).
Some BSD versions, I think, still don't require 16-byte stack alignment in 32-bit code, so 32-bit code that wants to use aligned memory for a double, int64_t, or especially an XMM vector does need to manually align the stack instead of relying on incoming stack alignment.
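In C, you normally don't write that alignment by hand. You request it on the object, and the compiler realigns the stack itself if it can't rely on the incoming alignment. A minimal sketch, assuming a C11 compiler; sum_aligned_local is a made-up name for illustration:

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Made-up example function: keeps a 16-byte-aligned local, the kind of
   object an aligned XMM load or store would need. */
int sum_aligned_local(void)
{
    alignas(16) int32_t v[4] = {1, 2, 3, 4};

    /* The compiler must honour alignas(16) whatever alignment the caller
       provided, so this check should hold. */
    assert(((uintptr_t)v & 15) == 0);

    return v[0] + v[1] + v[2] + v[3];
}
```

If you build this with -m32 for a target that only promises 4-byte incoming alignment, I'd expect to see the compiler emit its own stack realignment in this function's prologue.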
But even on modern Linux, GCC's 32-bit-mode (-m32) code for main doesn't assume that main's caller (or the kernel) follows the ABI; it manually aligns the stack.
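That manual alignment is typically an and $-16, %esp in the prologue, which rounds the stack pointer down to a multiple of 16. The arithmetic can be sketched in C as below; align_down is a hypothetical helper that only models the instruction on an integer, it does not touch the real stack:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of "and $-boundary, %esp": clears the low bits so
   the result is the nearest multiple of boundary at or below sp.
   The stack grows down, so rounding down never overlaps the caller's data. */
uintptr_t align_down(uintptr_t sp, uintptr_t boundary)
{
    assert(boundary != 0 && (boundary & (boundary - 1)) == 0); /* power of 2 */
    return sp & ~(boundary - 1); /* same bits as sp & -boundary */
}
```

For example, align_down(0xffffd12c, 16) yields 0xffffd120, and an already aligned value such as 0xffffd120 comes back unchanged.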
See Responsibility of stack alignment in 32-bit x86 assembly for more; that is another question where the obsolete instruction led to confusion, based on the assumption that it was needed.
GCC on x86-64 does not do this; it simply takes advantage of the fact that 16-byte stack alignment has always been a requirement in the x86-64 System V ABI (and in the Windows x64 ABI).