What resides in the different memory types of a microcontroller?
.text
The .text segment contains the actual code, and is programmed into Flash memory for microcontrollers. There may be more than one text segment when there are multiple, non-contiguous blocks of Flash memory; e.g. a start vector and interrupt vectors located at the top of memory, and code starting at 0; or separate sections for a bootstrap and main program.
.bss and .data
There are three types of data that can be allocated external to a function or procedure. The first is uninitialized data (historically called .bss, which also includes zero-initialized data), and the second is initialized (non-bss) data, or .data. The name "bss" comes from "Block Started by Symbol", a directive in an assembler from some 60 years ago. Both of these areas are located in RAM.
As a program is compiled, variables will be allocated to one of these two general areas. During the linking stage, all of the data items will be collected together. All variables which need to be initialized will have a portion of the program memory set aside to hold the initial values, and just before main() is called, the variables will be initialized, typically by a module called crt0. The bss section is initialized to all zeros by the same startup code.
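To make the start-up step concrete, here is a minimal sketch of the copy-and-clear work described above. The function name and its parameters are hypothetical; a real crt0 is usually written against linker-defined symbols (whose names vary by toolchain) rather than taking arguments, so treat this as an illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch of what start-up code does before main(). All names here are
 * hypothetical; a real crt0 gets these addresses from linker-defined symbols.
 */
void startup_init_ram(const uint8_t *data_load, /* initial values held in program memory */
                      uint8_t *data_start,      /* where .data lives in RAM */
                      size_t data_size,
                      uint8_t *bss_start,       /* where .bss lives in RAM */
                      size_t bss_size)
{
    size_t i;

    /* Copy the stored initial values into the initialized variables. */
    for (i = 0; i < data_size; i++) {
        data_start[i] = data_load[i];
    }

    /* Clear everything that was not explicitly initialized. */
    for (i = 0; i < bss_size; i++) {
        bss_start[i] = 0;
    }
}
```

Plain loops are used instead of library calls because, this early in start-up, it is safest not to depend on anything that itself needs initialized data.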
With a few microcontrollers, there are shorter instructions that allow access to the first page (the first 256 locations, sometimes called page 0) of RAM. The compiler for these processors may reserve a keyword like near to designate variables to be placed there. Similarly, there are also microcontrollers that can only reference certain areas via a pointer register (requiring extra instructions), and such variables are designated far. Finally, some processors can address a section of memory bit by bit, and the compiler will have a way to specify that (such as the keyword bit).
So there might be additional segments like .nearbss and .neardata, etc., where these variables are collected.
.rodata
The third type of data external to a function or procedure is like the initialized variables, except it is read-only and cannot be modified by the program. In the C language, these variables are denoted using the const keyword. They are usually stored as part of the program flash memory. Sometimes they are identified as part of a .rodata (read-only data) segment. On microcontrollers using the Harvard architecture, the compiler must use special instructions to access these variables.
stack and heap
The stack and heap are both placed in RAM. Depending on the architecture of the processor, the stack may grow up or grow down. If it grows up, it will be placed at the bottom of RAM. If it grows down, it will be placed at the end of RAM. The heap will use the remaining RAM not allocated to variables, and grow in the opposite direction to the stack. The maximum size of the stack and heap can usually be specified as linker parameters.
Variables placed on the stack are any variables defined within a function or procedure without the keyword static. They were once called automatic variables (auto keyword), but that keyword is not needed. Historically, auto exists because it was part of the B language which preceded C, and there it was needed. Function parameters are also placed on the stack.
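As a small illustration of the difference the static keyword makes inside a function, here is a sketch with one variable of each kind. The function and variable names are made up for the example.

```c
/* Returns how many times it has been called so far. */
unsigned count_calls(void)
{
    static unsigned calls;  /* static storage duration: keeps its value between calls */
    unsigned step = 1;      /* automatic: created afresh (stack or register) on each call */

    calls += step;
    return calls;
}
```

Calling count_calls() three times returns 1, 2 and 3, because calls lives outside the stack, while step is re-created on every entry.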
Here is a typical layout for RAM (assuming no special page 0 section):
EEPROM, ROM, and NVRAM
Before Flash memory came along, EEPROM (electrically erasable programmable read-only memory) was used to store the program and const data (.text and .rodata segments). Now there is just a small amount (e.g. 2 KB to 8 KB) of EEPROM available, if any at all, and it is typically used for storing configuration data or other small amounts of data that need to be retained over a power-down/power-up cycle. These are not declared as variables in the program, but instead are written to using special registers in the microcontroller. EEPROM may also be implemented in a separate chip and accessed via an SPI or I²C bus.
ROM is essentially the same as Flash, except it is programmed at the factory (not programmable by the user). It is used only for very high volume devices.
NVRAM (non-volatile RAM) is an alternative to EEPROM, and is usually implemented as an external IC. Regular RAM may be considered non-volatile if it is battery-backed; in that case no special access methods are needed.
Although data can be saved to Flash, Flash memory has a limited number of erase/program cycles (1000 to 10,000) so it's not really designed for that. It also requires blocks of memory to be erased at once, so it's inconvenient to update just a few bytes. It's intended for code and read-only variables.
EEPROM has much higher limits on erase/program cycles (100,000 to 1,000,000), so it is much better for this purpose. If there is EEPROM available on the microcontroller and it's large enough, it's where you want to save non-volatile data. However, on some parts you will also have to erase in blocks first (typically 4 KB) before writing.
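To put those cycle counts in perspective, here is a rough back-of-the-envelope helper. The write rate is an assumption chosen for illustration (one write per minute to the same location, i.e. 1440 writes per day), and the function name is hypothetical.

```c
#include <stdint.h>

/* Whole days until one memory location reaches its rated erase/program cycles. */
uint32_t days_until_worn_out(uint32_t rated_cycles, uint32_t writes_per_day)
{
    if (writes_per_day == 0) {
        return UINT32_MAX;  /* never written, so it never wears out */
    }
    return rated_cycles / writes_per_day;
}
```

At 1440 writes per day, a 10,000-cycle Flash location lasts about 6 days, while a 1,000,000-cycle EEPROM location lasts about 694 days. Real figures depend on the part and on how writes are spread across the memory.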
If there is no EEPROM or it's too small, then an external chip is needed. A 32 KB EEPROM is only 66¢ and can be erased/written to 1,000,000 times. An NVRAM with the same number of erase/program operations is much more expensive (about 10 times as much). NVRAMs are typically faster for reading than EEPROMs, but slower for writing. They may be written to one byte at a time, or in blocks.
A better alternative to both of these is FRAM (ferroelectric RAM), which has essentially infinite write cycles (100 trillion) and no write delays. It's about the same price as NVRAM, around $5 for 32KB.
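Whichever non-volatile memory is used, a common way to stretch a limited number of erase/program cycles is to skip writes that would not change the stored value. The sketch below simulates this with a plain array standing in for the memory; every name in it is hypothetical, and on real hardware the read and write would go through the device's registers or bus driver.

```c
#include <stddef.h>
#include <stdint.h>

#define SIM_NV_SIZE 16u

static uint8_t  sim_nv[SIM_NV_SIZE]; /* stand-in for the real non-volatile memory */
static unsigned sim_write_count;     /* number of program operations performed */

/*
 * Write a byte only if it differs from what is already stored.
 * Returns 1 if a write happened, 0 if it was skipped, -1 for a bad address.
 */
int nv_update_byte(size_t addr, uint8_t value)
{
    if (addr >= SIM_NV_SIZE) {
        return -1;
    }
    if (sim_nv[addr] == value) {
        return 0;  /* unchanged: do not spend a write cycle */
    }
    sim_nv[addr] = value;
    sim_write_count++;
    return 1;
}

unsigned nv_write_count(void)
{
    return sim_write_count;
}
```

Saving the same configuration byte repeatedly then costs a single write cycle instead of one per save.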
Normal embedded system:

Segment   Memory   Contents
.data     RAM      Explicitly initialized variables with static storage duration
.bss      RAM      Zero-initialized variables with static storage duration
.stack    RAM      Local variables and function call parameters
.heap     RAM      Dynamically allocated variables (usually not used in embedded systems)
.rodata   ROM      const variables with static storage duration. String literals.
.text     ROM      The program. Integer constants. Initializer lists.
In addition, there are usually separate flash segments for start-up code and interrupt vectors.
Explanation:
A variable has static storage duration if it is declared as static or if it resides at file scope (sometimes sloppily called "global"). C has a rule stating that all static storage duration variables that the programmer did not initialize explicitly must be initialized to zero.
Every static storage duration variable that is initialized to zero, implicitly or explicitly, ends up in .bss, while those that are explicitly initialized to a non-zero value end up in .data.
Examples:
static int a; // .bss
static int b = 0; // .bss
int c; // .bss
static int d = 1; // .data
int e = 1; // .data
void func (void)
{
  static int x; // .bss
  static int y = 0; // .bss
  static int z = 1; // .data
  static int* ptr = NULL; // .bss
}
Please keep in mind that a very common non-standard setup for embedded systems is to have a "minimal start-up", which means that the program will skip all initialization of objects with static storage duration. Therefore it might be wise to never write programs that rely on the initialization values of such variables, but instead set them at run-time before they are used for the first time.
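One way to follow that advice is to give each module an explicit init function and call it before anything else in the module. This is only a sketch; the module, its variables and the threshold value of 100 are invented for the example.

```c
#include <stdint.h>

static uint16_t sample_count; /* no reliance on .bss being zeroed at start-up */
static uint16_t threshold;    /* no reliance on a .data initializer being copied */

/* Must be called once before any other function in this module. */
void sensor_module_init(void)
{
    sample_count = 0;
    threshold = 100;  /* hypothetical limit, set at run-time */
}

/* Returns 1 if the reading exceeds the threshold, otherwise 0. */
int sensor_over_threshold(uint16_t reading)
{
    sample_count++;
    return reading > threshold;
}

uint16_t sensor_sample_count(void)
{
    return sample_count;
}
```

With this pattern the code behaves the same whether the start-up code is full or minimal, at the cost of a few extra instructions at boot.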
Examples of the other segments:
const int a = 0; // .rodata
const int b; // .rodata (nonsense code but C allows it, unlike C++)
static const int c = 0; // .rodata
static const int d = 1; // .rodata
void func (int param) // .stack
{
  int e; // .stack
  int f=0; // .stack
  int g=1; // .stack
  const int h=param; // .stack
  static const int i=1; // .rodata, static storage duration
  char* ptr; // ptr goes to .stack
  ptr = malloc(1); // pointed-at memory goes to .heap
}
Variables that can go on the stack may often end up in CPU registers during optimization. As a rule of thumb, any variable which doesn't have its address taken can be placed in a CPU register.
Note that pointers are a bit more intricate than other variables, since they allow two different kinds of const, depending on if the pointed-at data should be read-only, or if the pointer itself should be. It is very important to know the difference so your pointers don't end up in RAM by accident, when you wanted them to be in flash.
int* j=0; // .bss
const int* k=0; // .bss, non-const pointer to const data
int* const l=0; // .rodata, const pointer to non-const data
const int* const m=0; // .rodata, const pointer to const data
void (*fptr1)(void); // .bss
void (*const fptr2)(void); // .rodata
void (const* fptr3)(void); // invalid, doesn't make sense since functions can't be modified
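A typical place where both kinds of const matter is a lookup table of strings. In the sketch below (names invented for the example) the first const makes the characters read-only and the second makes the pointers themselves read-only, so a typical embedded toolchain is free to keep the whole table in flash. Dropping the second const would usually move the array of pointers into RAM.

```c
#include <stddef.h>

/* const data AND const pointers: nothing here needs to live in RAM. */
static const char *const state_names[] = { "idle", "running", "fault" };

#define STATE_COUNT (sizeof state_names / sizeof state_names[0])

/* Returns the name for a state index, or "unknown" if it is out of range. */
const char *state_name(size_t state)
{
    return (state < STATE_COUNT) ? state_names[state] : "unknown";
}
```

Where the table actually ends up still depends on the compiler and linker script, so check the map file if it matters.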
In the case of integer constants, initializer lists, string literals etc., they may end up either in .text or .rodata depending on the compiler. Likely, they end up as:
#define n 0 // .text
int o = 5; // 5 goes to .text (part of the instruction)
int p[] = {1,2,3}; // {1,2,3} goes to .text
char q[] = "hello"; // "hello" goes to .rodata
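The last line is worth a closer look: an array initialized from a string literal is a separate, writable object that gets its contents from the stored literal, whereas a pointer to a literal refers to the read-only text itself. A small sketch, with names invented for the example:

```c
#include <stddef.h>

char        greeting_copy[] = "hello"; /* writable array, initialized from the literal */
const char *greeting_ptr    = "hello"; /* points at the literal; must not be written through */

/* Capitalizes the first letter of the writable copy and returns it. */
char capitalize_copy(void)
{
    greeting_copy[0] = 'H';
    return greeting_copy[0];
}

/* Size of the array in bytes: five letters plus the terminating '\0'. */
size_t copy_size(void)
{
    return sizeof greeting_copy;
}
```

So the array costs 6 bytes of RAM in addition to the stored initial values, while the pointer version costs only the pointer.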
While any data can go into any memory the programmer chooses, generally the system works best (and is intended to be used) where the use profile of the data is matched to the read/write profiles of the memory.
For instance, program code is WFRM (write few, read many), and there's a lot of it. This fits Flash nicely. ROM, on the other hand, is write once, read many.
Stack and heap are small, with lots of reads and writes. That would fit RAM best.
EEPROM would not suit either of those uses well, but it does suit the profile of small amounts of data persistent across power-ups, such as user-specific initialisation data and perhaps logged results.