What registers to save in the ARM C calling convention?

The answers of CesarB and Pavel provided quotes from AAPCS, but open issues remain. Does the callee save r9? What about r12? What about r14? Furthermore, the answers were very general, and not specific to the arm-eabi toolchain as requested. Here's a practical approach to find out which register are callee-saved and which are not.

The following C code contain an inline assembly block, that claims to modify registers r0-r12 and r14. The compiler will generate the code to save the registers required by the ABI.

void foo() {
  asm volatile ( "nop" : : : "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r11", "r12", "r14");
}

Use the command line arm-eabi-gcc-4.7 -O2 -S -o - foo.c and add the switches for your platform (such as -mcpu=arm7tdmi for example). The command will print the generated assembly code on STDOUT. It may look something like this:

foo:
    stmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    nop
    ldmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    bx  lr

Note, that the compiler generated code saves and restores r4-r11. The compiler does not save r0-r3, r12. That it restores r14 (alias lr) is purely accidental as I know from experience that the exit code may also load the saved lr into r0 and then do a "bx r0" instead of "bx lr". Either by adding the -mcpu=arm7tdmi -mno-thumb-interwork or by using -mcpu=cortex-m4 -mthumb we obtain slightly different assembly code that looks like this:

foo:
    stmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, lr}
    nop
    ldmfd   sp!, {r4, r5, r6, r7, r8, r9, sl, fp, pc}

Again, r4-r11 are saved and restored. But r14 (alias lr) is not restored.

To summarize:

  • r0-r3 are not callee-saved
  • r4-r11 are callee-saved
  • r12 (alias ip) is not callee-saved
  • r13 (alias sp) is callee-saved
  • r14 (alias lr) is not callee-saved
  • r15 (alias pc) is the program counter and is set to the value of lr prior to the function call

This holds at least for arm-eabi-gcc's default's. There are command line switches (in particular the -mabi switch) that may influence the results.


It depends on the ABI for the platform you are compiling for. On Linux, there are two ARM ABIs; the old one and the new one. AFAIK, the new one (EABI) is in fact ARM's AAPCS. The complete EABI definitions currently live here on ARM's infocenter.

From the AAPCS, §5.1.1:

  • r0-r3 are the argument and scratch registers; r0-r1 are also the result registers
  • r4-r8 are callee-save registers
  • r9 might be a callee-save register or not (on some variants of AAPCS it is a special register)
  • r10-r11 are callee-save registers
  • r12-r15 are special registers

A callee-save register must be saved by the callee (in opposition to a caller-save register, where the caller saves the register); so, if this is the ABI you are using, you do not have to save r10 before calling another function (the other function is responsible for saving it).

Edit: Which compiler you are using makes no difference; gcc in particular can be configured for several different ABIs, and it can even be changed on the command line. Looking at the prologue/epilogue code it generates is not that useful, since it is tailored for each function and the compiler can use other ways of saving a register (for instance, saving it in the middle of a function).


Terminology: "callee-save" is a synonym for "non-volatile" or "call-preserved": What are callee and caller saved registers?
When making a function call, you can assume that the values in r4-r11 (except maybe r9) are still there after (call-preserved), but not for r0-r3 (call-clobbered / volatile).


For 64-bit ARM, A64 (from Procedure Call Standard for the ARM 64-bit Architecture)

There are thirty-one, 64-bit, general-purpose (integer) registers visible to the A64 instruction set; these are labeled r0-r30. In a 64-bit context these registers are normally referred to using the names x0-x30; in a 32-bit context the registers are specified by using w0-w30. Additionally, a stack-pointer register, SP, can be used with a restricted number of instructions.

  • SP The Stack Pointer
  • r30 LR The Link Register
  • r29 FP The Frame Pointer
  • r19…r28 Callee-saved registers
  • r18 The Platform Register, if needed; otherwise a temporary register.
  • r17 IP1 The second intra-procedure-call temporary register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
  • r16 IP0 The first intra-procedure-call scratch register (can be used by call veneers and PLT code); at other times may be used as a temporary register.
  • r9…r15 Temporary registers
  • r8 Indirect result location register
  • r0…r7 Parameter/result registers

The first eight registers, r0-r7, are used to pass argument values into a subroutine and to return result values from a function. They may also be used to hold intermediate values within a routine (but, in general, only between subroutine calls).

Registers r16 (IP0) and r17 (IP1) may be used by a linker as a scratch register between a routine and any subroutine it calls. They can also be used within a routine to hold intermediate values between subroutine calls.

The role of register r18 is platform specific. If a platform ABI has need of a dedicated general purpose register to carry inter-procedural state (for example, the thread context) then it should use this register for that purpose. If the platform ABI has no such requirements, then it should use r18 as an additional temporary register. The platform ABI specification must document the usage for this register.

SIMD

The ARM 64-bit architecture also has a further thirty-two registers, v0-v31, which can be used by SIMD and Floating-Point operations. The precise name of the register will change indicating the size of the access.

Note: Unlike in AArch32, in AArch64 the 128-bit and 64-bit views of a SIMD and Floating-Point register do not overlap multiple registers in a narrower view, so q1, d1 and s1 all refer to the same entry in the register bank.

The first eight registers, v0-v7, are used to pass argument values into a subroutine and to return result values from a function. They may also be used to hold intermediate values within a routine (but, in general, only between subroutine calls).

Registers v8-v15 must be preserved by a callee across subroutine calls; the remaining registers (v0-v7, v16-v31) do not need to be preserved (or should be preserved by the caller). Additionally, only the bottom 64-bits of each value stored in v8-v15 need to be preserved; it is the responsibility of the caller to preserve larger values.


32-bit ARM calling conventions are specified by AAPCS

From the AAPCS, §5.1.1 Core registers:
  • r0-r3 are the argument and scratch registers; r0-r1 are also the result registers
  • r4-r8 are callee-save registers
  • r9 might be a callee-save register or not (on some variants of AAPCS it is a special register)
  • r10-r11 are callee-save registers
  • r12-r15 are special registers

From the AAPCS, §5.1.2.1 VFP register usage conventions:

  • s16–s31 (d8–d15, q4–q7) must be preserved
  • s0–s15 (d0–d7, q0–q3) and d16–d31 (q8–q15) do not need to be preserved

Original post:
arm-to-c-calling-convention-neon-registers-to-save


64-bit ARM calling conventions are specified by AAPCS64

General-purpose Registers section specifies what registers need be preserved.
  • r0-r7 are parameter/result registers
  • r9-r15 are temporary registers
  • r19-r28 are callee-saved registers.
  • All others (r8, r16-r18, r29, r30, SP) have special meaning and some might be treated as temporary registers.

SIMD and Floating-Point Registers specifies Neon and floating point registers.