Why is it possible to declare an array in C without defining its length?
It's a convenience feature. The size of the array is deduced from the initializer, so you don't have to spell it out:
int arr[] = {10, 20, 30, 40, 50};
is equivalent to
int arr[5] = {10, 20, 30, 40, 50};
Another example of this (thanks to Eugene Sh.) is string initializers:
char str[] = "asd";
is equivalent to
char str[4] = "asd";
One important caveat: things are different when an array type is used for a function parameter. All of the following forms:
void foo(int v[])
void foo(int v[1])
void foo(int v[5])
void foo(int v[1000])
are equivalent to each other, and the compiler adjusts each of them to this:
void foo(int* v)
Always use the latter (void foo(int* v)), never the other ones: the array forms make it look like the parameter has an array type, when in reality it is a pointer. That's misleading.
To complement the existing answer, quoting from C11, chapter §6.7.9, P22:
If an array of unknown size is initialized, its size is determined by the largest indexed element with an explicit initializer. The array type is completed at the end of its initializer list.
So the size of the array is decided by the "largest indexed element" or, simply speaking (when no designators are used), by the number of elements present in the initializer list.
This is acceptable because the size (in bytes) of an int is known at compile time, so the compiler knows how much space the entire list requires.
But to understand this answer, one has to dig a little deeper and ask why it is so important to know the exact size at compile time. Generally speaking: to lay out the virtual address space of your program. Part of that is the stack, on which local variables are stored and which must not be confused with heap memory (where malloc works). The stack is a LIFO structure and also holds every function call together with its parameters. For each call it stores a return address, which is used at the end of the function to jump back to where you came from. Everything you put on the stack while you are in your function has to be released again in order to get to the correct return address and to avoid a potential segfault.
Fortunately, C does this kind of memory management for us automatically and frees all of our automatic variables once they go out of scope. To do that, it needs the exact size of what was pushed onto the stack, and that is why the compiler already needs to know that size.
To illustrate how the compiler translates your code and hard-codes these numbers, see here:
$ echo "int int_size = sizeof(int); int main(void) { int arr[] = {10, 20, 30, 40, 50}; }" |\
gcc -c -xc -S -o- -masm=intel -
.file ""
.intel_syntax noprefix
.text
.globl main
.type main, @function
# [...] removed int_size here to keep it shorter. it's "4" ;)
main:
.LFB0:
.cfi_startproc
push rbp # < backup rbp / stack base pointer
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
mov rbp, rsp # < rsp / stack pointer = top of the stack; rbp now marks the frame base
.cfi_def_cfa_register 6
sub rsp, 32
mov rax, QWORD PTR fs:40
mov QWORD PTR -8[rbp], rax
xor eax, eax
mov DWORD PTR -32[rbp], 10 # < 10 is one element from the array
mov DWORD PTR -28[rbp], 20 # < -28 is an offset relative to rbp (the frame base), 4 bytes after the previous element
mov DWORD PTR -24[rbp], 30
mov DWORD PTR -20[rbp], 40
mov DWORD PTR -16[rbp], 50
mov eax, 0
mov rdx, QWORD PTR -8[rbp]
xor rdx, QWORD PTR fs:40
je .L3
call __stack_chk_fail@PLT
.L3:
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 8.2.1 20181127"
.section .note.GNU-stack,"",@progbits