Why is the .bss segment required?
The reason is to reduce program size. Imagine that your C program runs on an embedded system, where the code and all constants are stored in true ROM (flash memory). In such systems, an initial "copy-down" must be executed to initialize all objects with static storage duration before main() is called. In pseudocode, it typically looks like this:
for(i=0; i<all_explicitly_initialized_objects; i++)
{
  .data[i] = init_value[i];
}
memset(.bss, 0, all_implicitly_initialized_objects);
Here .data and .bss are located in RAM, but init_value is stored in ROM. If there were only one segment, the ROM would have to be filled with a lot of zeroes, increasing ROM size significantly.
RAM-based executables work similarly, though of course they have no true ROM.
Also, memset is likely implemented in very efficient assembly, so zeroing .bss in a single call lets the startup copy-down run faster.
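The pseudocode above can be sketched in compilable C. This is only an illustration under stated assumptions: the arrays rom_init_values, ram_data and ram_bss are hypothetical stand-ins for regions that a real linker script would define, and copy_down and fill_ram_with_garbage are invented names, not part of any real startup library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins: on a real target, these regions and their
   sizes come from the linker script, not from ordinary arrays. */
static const unsigned char rom_init_values[4] = {1, 2, 3, 4}; /* lives in ROM */
static unsigned char ram_data[4]; /* plays the role of .data */
static unsigned char ram_bss[8];  /* plays the role of .bss  */

/* Simulate RAM holding arbitrary contents at power-on. */
void fill_ram_with_garbage(void)
{
    memset(ram_data, 0xAA, sizeof ram_data);
    memset(ram_bss, 0xAA, sizeof ram_bss);
}

/* The copy-down: every byte of .data needs an initial value stored in ROM,
   while .bss needs only its size. */
void copy_down(void)
{
    assert(sizeof ram_data == sizeof rom_init_values);
    memcpy(ram_data, rom_init_values, sizeof ram_data);
    memset(ram_bss, 0, sizeof ram_bss);
}
```

Calling fill_ram_with_garbage() and then copy_down() leaves ram_data holding the ROM values and ram_bss all zero. Note that the ROM image carries 4 bytes for the 4-byte .data stand-in, but nothing at all for the 8-byte .bss stand-in.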
The .bss segment is an optimization. The entire .bss segment is described by a single number, probably 4 bytes or 8 bytes, that gives its size in the running process, whereas the .data section is as big as the sum of the sizes of the initialized variables. Thus, .bss makes executables smaller and quicker to load. Otherwise, the variables could be in the .data segment with explicit initialization to zeroes; the program would be hard-pressed to tell the difference. (In detail, the addresses of the objects in .bss would probably differ from the addresses they would have in the .data segment.)
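As a rough illustration of the size difference, consider the two hypothetical arrays below (the names are invented for this sketch and are not taken from the question). With a typical toolchain, initialized_table contributes its full size to the executable file, while zeroed_table contributes only to the recorded .bss size; the exact placement is up to the compiler and linker.

```c
#include <stddef.h>

/* Explicit non-zero initializer: typically placed in .data, so all
   1024 * sizeof(int) bytes of initial values are stored in the file. */
int initialized_table[1024] = {1, 2, 3};

/* No initializer: typically placed in .bss, so the file records only
   the size and the memory is zeroed before main() runs. */
int zeroed_table[1024];

/* Returns the sum of zeroed_table; C guarantees that objects with static
   storage duration and no initializer start out as zero. */
long sum_zeroed_table(void)
{
    long sum = 0;
    for (size_t i = 0; i < sizeof zeroed_table / sizeof zeroed_table[0]; i++)
        sum += zeroed_table[i];
    return sum;
}
```

Building this into a program and inspecting it with a tool such as size (from GNU binutils) should show the two arrays counted under data and bss respectively, assuming the toolchain follows the usual convention.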
In the first program, a would be in the .data segment and b would be in the .bss segment of the executable. Once the program is loaded, the distinction becomes immaterial. At run time, b occupies 20 * sizeof(int) bytes.
In the second program, var is allocated space and the assignment in main() modifies that space. It so happens that the space for var was described in the .bss segment rather than the .data segment, but that doesn't affect the way the program behaves when running.