Why does my SRAM fill up so quickly? My variables take no more than 60 bytes

When you are asking yourself what is eating so much RAM, the first step is to look at the symbol table in the ELF file. If you use a makefile, you probably know where to find the ELF file. If you are using the Arduino IDE, go to File / Preferences, check “Show verbose output during compilation”, compile and look at the output: you will see the temporary directory where the compiler puts your ELF file.

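Now run the command avr-nm -Crtd --size-sort your_elf_file and look for symbols of type 'D' (data), 'V' (weak object, which here means the vtables) and 'B' (BSS), in either upper or lower case. On a Unix-style OS you would pipe the output through grep -i ' [dbv] '. The first column is the symbol size in decimal (that is what the -td option asks for). Running this on your program gives: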

00000068 B tx_buffer
00000068 B rx_buffer
00000034 B Serial
00000021 B lcd
00000016 V vtable for HardwareSerial
00000008 V vtable for LiquidCrystal
00000004 B timer0_overflow_count
00000004 B timer0_millis
00000002 D __malloc_margin
00000002 D __malloc_heap_start
00000002 D __malloc_heap_end
00000002 B __flp
00000002 B __brkval
00000001 b timer0_fract
00000001 D encr
00000001 B debug
00000001 B EEPROM

These symbols add up to only 237 bytes, which obviously cannot account for the 1,023 bytes of static RAM your program is using. What this command misses is the literal arrays and strings. These can be seen with the command avr-objdump -j .data -s your_elf_file. The literal strings are quite obvious in the output, the literal arrays less so. Running this on your program gives a long listing starting with

Contents of section .data:
800100 00000005 2000010e 040d0102 0f0b0803  .... ...........
800110 0a060c05 09000700 0f07040e 020d010a  ................

Now, in the source code we see:

const uint8_t SBoxes[8][4][16] PROGMEM = {
{{14,  4,  13,  1,   2, 15,  11,  8,   3, 10,   6, 12,   5,  9,   0,  7},
 { 0, 15,   7,  4,  14,  2,  13,  1,  10,  6,  12, 11,   9,  5,   3,  8},
 { 4,  1,  14,  8,  13,  6,   2, 11,  15, 12,   9,  7,   3, 10,   5,  0},
 {15, 12,   8,  2,   4,  9,   1,  7,   5, 11,   3, 14,  10,  0,   6, 13} },
 ...

If you translate this to hexadecimal, you get 0e 04 0d 01 02 0f..., which also appears near the end of the first line of the previous listing. So there is your culprit: all the big PROGMEM arrays. The compiler does not honor the PROGMEM attribute on non-static local variables.
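To double-check the decimal-to-hex translation without doing it by hand, here is a minimal sketch meant to run on a PC, not on the Arduino. The helper toHexBytes is a hypothetical name of mine; it just prints each byte as two lowercase hex digits, the way avr-objdump -s shows them.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the sketch in the question): format
// bytes as space-separated pairs of lowercase hex digits.
std::string toHexBytes(const uint8_t *data, size_t len) {
    std::string out;
    char buf[4];
    for (size_t i = 0; i < len; i++) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(data[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// First row of the first S-box, copied from the source code above.
const uint8_t firstRow[16] = {14, 4, 13, 1, 2, 15, 11, 8,
                              3, 10, 6, 12, 5, 9, 0, 7};
```

Calling toHexBytes(firstRow, 16) yields the byte sequence you can then search for in the .data dump.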

My first thought was to make the arrays global, and this does solve the problem. However, as pointed out by Mikael Patel in a comment, the documentation on PROGMEM states that “variables must be either globally defined, OR defined with the static keyword, in order to work with PROGMEM.” So making the arrays static const PROGMEM is a cleaner solution.
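As a rough illustration of the static const PROGMEM approach, here is a minimal sketch using only the first S-box. The function name sboxEntry and the host fallback macros are my own assumptions, added so the same file also compiles on a PC; I have not seen how your code reads the table, so the pgm_read_byte access is shown as one possible way.

```cpp
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>
#else
// Host fallback (assumption): on a PC there is a single address space,
// so PROGMEM is a no-op and pgm_read_byte is a plain dereference.
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

// Hypothetical accessor: look up one entry of the first S-box.
uint8_t sboxEntry(uint8_t row, uint8_t col) {
    // "static" gives the table static storage duration, which is what
    // PROGMEM needs in order to keep the data in flash.
    static const uint8_t SBox1[4][16] PROGMEM = {
        {14,  4, 13,  1,  2, 15, 11,  8,  3, 10,  6, 12,  5,  9,  0,  7},
        { 0, 15,  7,  4, 14,  2, 13,  1, 10,  6, 12, 11,  9,  5,  3,  8},
        { 4,  1, 14,  8, 13,  6,  2, 11, 15, 12,  9,  7,  3, 10,  5,  0},
        {15, 12,  8,  2,  4,  9,  1,  7,  5, 11,  3, 14, 10,  0,  6, 13},
    };
    // Data kept in flash is read through pgm_read_byte on AVR.
    return pgm_read_byte(&SBox1[row][col]);
}
```

The table is still private to the function, so nothing leaks into the global namespace.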


To elaborate on Edgar Bonet's answer and my comment under it, you cannot usefully declare PROGMEM variables as locals, because local (non-static) variables have to be allocated on the stack, and the stack lives in SRAM.

void lookUpInSBox(size_t which, byte *address, byte* binaryOutcome, size_t addressFrom){

   ...

    const uint8_t SBoxes[8][4][16] PROGMEM = {

                             /*S1*/

   { {14,  4,  13,  1,   2, 15,  11,  8,   3, 10,   6, 12,   5,  9,   0,  7},
     { 0, 15,   7,  4,  14,  2,  13,  1,  10,  6,  12, 11,   9,  5,   3,  8},
     { 4,  1,  14,  8,  13,  6,   2, 11,  15, 12,   9,  7,   3, 10,   5,  0},
     {15, 12,   8,  2,   4,  9,   1,  7,   5, 11,   3, 14,  10,  0,   6, 13}    },

You could make them global. But if you prefer not to do that, put static in front of them. Doing that on your code reduced the static RAM usage to 247 bytes:

Global variables use 247 bytes (12%) of dynamic memory, leaving 1,801 bytes for local variables. Maximum is 2,048 bytes.

You should expect more than the 60 bytes you counted: the serial buffers alone take 128 bytes.

See Putting constant data into program memory (PROGMEM) for more information. On a tiny sketch I found the following 346 bytes already in use:

  • 34 bytes for the HardwareSerial instance (Serial)
  • 64 bytes for the Serial transmit buffer
  • 64 bytes for the Serial receive buffer
  • 4 bytes for the Serial transmit buffer head and tail pointers
  • 4 bytes for the Serial receive buffer head and tail pointers
  • 9 bytes for keeping track of millis / micros
  • 4 bytes for memory allocation (__malloc_heap_start, __malloc_margin)
  • 128 bytes for the heap safety margin
  • 6 bytes for a few nested function calls (main -> setup -> getFreeMemory)
  • 16 bytes for the compiler vtable for HardwareSerial
  • 4 bytes for variables __brkval and __flp (used in memdebug)
  • 2 bytes pushed onto the stack in main (to save registers)
  • 2 bytes pushed onto the stack in setup (to save registers)
  • 4 bytes pushed onto the stack in getFreeMemory (to save registers)
  • 1 byte because the stack pointer starts at 0x8FF rather than 0x900
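As a quick sanity check on the arithmetic, the items above can be added up in a few lines. This is only a sketch; the names kUsage and totalBytes are made up for the illustration.

```cpp
#include <cstddef>

// Byte counts from the list above, in the same order.
const int kUsage[] = {34, 64, 64, 4, 4, 9, 4, 128, 6, 16, 4, 2, 2, 4, 1};

// Sum every entry of the itemized list.
int totalBytes() {
    int sum = 0;
    for (size_t i = 0; i < sizeof kUsage / sizeof kUsage[0]; i++) {
        sum += kUsage[i];
    }
    return sum;
}
```

The total matches the 346 bytes quoted before the list.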

The sketch on my page used dynamic memory allocation so it used a bit more than yours does (for keeping track of it).


Once it starts printing gibberish or looping (that means it prints something and starts again from the beginning of the program) I know I have run out of SRAM

I'm a bit puzzled by all the new/delete you are doing. Why do that? For example:

byte *E;
E = new byte[48/8];  
for(size_t i = 0; i < 48; i++){
    insertBit( E, i, bitValue( R, E_BIT[i]-1 ) );
}

// XOR Kn and  E(Rn-1)
byte KxorE[48/8]; 
for (size_t i = 0; i < 48; i++){
    insertBit( KxorE, i, bitValue( K, i ) ^ bitValue( E, i ) );
}
delete[] E;

Massive amounts of dynamic memory allocation may fragment memory. Can't you rewrite the code to avoid it? You can pass arrays by reference; it just seems weird (and slow) to do all that allocation.
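Here is a rough sketch of the same step without new/delete. Your bitValue and insertBit are not shown in the question, so the versions below are my guesses (bits numbered from the most significant bit of byte 0). E_BIT is the standard DES expansion table, which I am assuming matches yours, and the function name expandAndXor is made up.

```cpp
#include <stddef.h>
#include <stdint.h>

// Assumed helper: read bit number pos, counting from the MSB of byte 0.
uint8_t bitValue(const uint8_t *data, size_t pos) {
    return (data[pos / 8] >> (7 - pos % 8)) & 1;
}

// Assumed helper: set or clear bit number pos (same numbering).
void insertBit(uint8_t *data, size_t pos, uint8_t bit) {
    const uint8_t mask = static_cast<uint8_t>(0x80 >> (pos % 8));
    if (bit) data[pos / 8] |= mask;
    else     data[pos / 8] &= static_cast<uint8_t>(~mask);
}

// Standard DES expansion table (1-based bit numbers), assumed to be
// the same as E_BIT in the question.
const uint8_t E_BIT[48] = {
    32,  1,  2,  3,  4,  5,   4,  5,  6,  7,  8,  9,
     8,  9, 10, 11, 12, 13,  12, 13, 14, 15, 16, 17,
    16, 17, 18, 19, 20, 21,  20, 21, 22, 23, 24, 25,
    24, 25, 26, 27, 28, 29,  28, 29, 30, 31, 32,  1,
};

// The caller owns the output array and passes it by reference. The
// scratch buffer E is a plain stack array, released on return.
void expandAndXor(const uint8_t (&R)[4], const uint8_t (&K)[6],
                  uint8_t (&KxorE)[6]) {
    uint8_t E[48 / 8];
    for (size_t i = 0; i < 48; i++) {
        insertBit(E, i, bitValue(R, E_BIT[i] - 1));
    }
    // XOR Kn and E(Rn-1)
    for (size_t i = 0; i < 48; i++) {
        insertBit(KxorE, i, bitValue(K, i) ^ bitValue(E, i));
    }
}
```

Nothing here touches the heap, so there is nothing to fragment and nothing to forget to delete. On the Arduino you could also move E_BIT into PROGMEM and read it with pgm_read_byte.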