Arduino String memory allocation
This is an excellent question for illustrating how much string copying and how many heap operations (malloc/free) go on when you use the Arduino String class.
void loop()
{
Serial.println(foo("def"));
while(1);
}
For loop(), the compiler will generate something like this:
void loop()
{
// String literal is stored in program memory. Needs to be copied
// to a temporary variable in RAM. This is done before main() is called.
static const char temp0[4] PROGMEM = "def";
static char temp1[4]; // not const: it is the destination of the copy
copy_from_program_memory(temp1, temp0);
// The parameter to the call to foo() is actually a temporary String
// variable constructed from the string literal. Storage is allocated
// on the heap and then assigned from the value of the string literal.
String temp2;
temp2.constructor(temp1);
// The return value of the call to foo() is a String that is passed
// to Serial.println(). This is also a temporary String variable.
String temp3;
temp3.constructor(foo(temp2));
Serial.println(temp3);
// The String class destructor has to be called for each temporary
// String variable so that the string values on the heap are deallocated.
temp2.destructor();
temp3.destructor();
}
Note that each call to the constructor involves allocating and copying data to the heap.
String foo(String arg1)
{
String test = "abc";
test = test + arg1;
return test;
}
The function foo() needs to copy strings several times to perform the string concatenation. The operator+ will require an intermediate String copy.
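To make the copying concrete, here is a minimal host-side sketch that counts heap allocations for the same foo(). TracedString, g_allocs and allocationsForFoo() are hypothetical stand-ins written for this illustration. They are not the Arduino String implementation, and the exact count depends on how much copy elision the compiler applies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a heap-backed string class (not the real String).
static int g_allocs = 0;  // number of heap buffers allocated so far

class TracedString {
public:
    TracedString(const char *s = "") { assign(s, std::strlen(s)); }
    TracedString(const TracedString &o) { assign(o.buf_, o.len_); }
    TracedString &operator=(const TracedString &o) {
        if (this != &o) {
            std::free(buf_);
            assign(o.buf_, o.len_);
        }
        return *this;
    }
    ~TracedString() { std::free(buf_); }

    // Concatenation builds a new string: copy the left side, then grow it.
    friend TracedString operator+(const TracedString &a, const TracedString &b) {
        TracedString r(a);
        r.append(b.buf_, b.len_);
        return r;
    }

    const char *c_str() const { return buf_; }

private:
    // Allocate a fresh buffer and copy n characters plus a terminator.
    void assign(const char *s, std::size_t n) {
        buf_ = static_cast<char *>(std::malloc(n + 1));
        assert(buf_ != nullptr);
        ++g_allocs;
        std::memcpy(buf_, s, n);
        buf_[n] = '\0';
        len_ = n;
    }
    // Grow by allocating a larger buffer and copying both parts into it.
    void append(const char *s, std::size_t n) {
        char *grown = static_cast<char *>(std::malloc(len_ + n + 1));
        assert(grown != nullptr);
        ++g_allocs;
        std::memcpy(grown, buf_, len_);
        std::memcpy(grown + len_, s, n);
        grown[len_ + n] = '\0';
        std::free(buf_);
        buf_ = grown;
        len_ += n;
    }

    char *buf_ = nullptr;
    std::size_t len_ = 0;
};

// Same shape as foo() above, using the traced class.
TracedString foo(TracedString arg1) {
    TracedString test = "abc";
    test = test + arg1;
    return test;
}

// Returns how many heap buffers one call to foo("def") allocated.
int allocationsForFoo() {
    g_allocs = 0;
    TracedString result = foo("def");
    (void)result;
    return g_allocs;
}
```

With this toy class, one call to foo("def") allocates at least five buffers to produce a six-character result: one for the argument, one for "abc", two inside operator+, and one for the assignment. Compilers that skip the optional return-value optimization will allocate more.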
Cheers!
PS: For more details, see the assembly listing below, which shows the calls to String member functions. The compiler inlines member functions and reuses temporary local variables. The call to foo() is also inlined. The member function reserve() is called as part of the constructor.
00000198 <main>:
1de: 60 df rcall .-320 ; 0xa0 <String::reserve(unsigned int)>
1fc: 50 d2 rcall .+1184 ; 0x69e <strcpy>
212: 46 df rcall .-372 ; 0xa0 <String::reserve(unsigned int)>
230: 36 d2 rcall .+1132 ; 0x69e <strcpy>
248: 62 df rcall .-316 ; 0x10e <String::operator=(String const&)>
26a: 1a df rcall .-460 ; 0xa0 <String::reserve(unsigned int)>
27e: 0f d2 rcall .+1054 ; 0x69e <strcpy>
28e: 3f df rcall .-386 ; 0x10e <String::operator=(String const&)>
294: 60 df rcall .-320 ; 0x156 <String::~String()>
29a: 5d df rcall .-326 ; 0x156 <String::~String()>
2a0: 5a df rcall .-332 ; 0x156 <String::~String()>
The bottom line is that the String class uses a lot of instruction cycles and memory, and there is a potential risk of heap fragmentation and allocation failure.
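One common way to sidestep these heap operations is to build the result in a caller-supplied character buffer instead. The following is only a sketch of that idea: fooInto() is a hypothetical helper written for this illustration, and it assumes the caller can choose a buffer size that is large enough for the expected result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: writes "abc" followed by arg into out.
// Returns true only if the whole result, including the terminator, fit.
bool fooInto(char *out, std::size_t outSize, const char *arg) {
    assert(out != nullptr && arg != nullptr);
    int written = std::snprintf(out, outSize, "abc%s", arg);
    return written >= 0 && static_cast<std::size_t>(written) < outSize;
}
```

In a sketch this could be used as `char buf[16]; if (fooInto(buf, sizeof buf, "def")) Serial.println(buf);`, which keeps the whole operation in a fixed-size buffer at the cost of choosing that size up front.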
The truth here turns out to be a bit more complicated than the other answers express. There's more than one implementation of the Arduino String library, and some are better than others. Which one you get depends on which board you deploy to.
The newer code handles strings a whole lot more efficiently, particularly in terms of memory usage, and changes the whole dynamic around the answer to this question.
Compare the original version of the Arduino String library, the one for AVR, to the version for the ESP8266.
(As of writing this answer, these were this snapshot and this snapshot, respectively.)
Looking in particular at functions like changeBuffer, you can see that the ESP version contains something called "SSO" ("Small String Optimization"), added with this PR here. Basically, for small strings (small enough to fit in the structure that would otherwise be used to track the string buffer on the heap), it'll just store the string in that space, in the String object itself, instead.
In practice, this means strings smaller than 12 characters don't require a heap allocation; they're stored inside the String object itself, which for a local variable means on the stack.
This is a pretty huge deal if you push around a lot of tiny strings, like in the example listed in the question. It means that the code works about as efficiently as you might hope, without problematic side-effects.
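The idea behind SSO can be sketched with a small host-side class. This is an illustration only, not the ESP8266 core's code: SsoString and its members are hypothetical names, and the inline capacity of 11 characters plus a terminator is an assumption taken from the "smaller than 12 characters" figure above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Illustrative small-string layout; the real library's layout differs.
class SsoString {
public:
    // Assumed capacity: 11 characters plus the terminating NUL.
    static constexpr std::size_t kInlineCapacity = 11;

    explicit SsoString(const char *s) {
        assert(s != nullptr);
        len_ = std::strlen(s);
        if (len_ <= kInlineCapacity) {
            // Short string: store it in the object itself, no malloc.
            std::memcpy(inline_, s, len_ + 1);
            onHeap_ = false;
        } else {
            // Long string: fall back to a heap buffer.
            heap_ = static_cast<char *>(std::malloc(len_ + 1));
            assert(heap_ != nullptr);
            std::memcpy(heap_, s, len_ + 1);
            onHeap_ = true;
        }
    }
    ~SsoString() {
        if (onHeap_) std::free(heap_);
    }
    // Copying is left out to keep the sketch short.
    SsoString(const SsoString &) = delete;
    SsoString &operator=(const SsoString &) = delete;

    bool usesHeap() const { return onHeap_; }
    const char *c_str() const { return onHeap_ ? heap_ : inline_; }
    std::size_t length() const { return len_; }

private:
    // The heap pointer and the inline buffer share the same storage.
    union {
        char *heap_;
        char inline_[kInlineCapacity + 1];
    };
    std::size_t len_;
    bool onHeap_;
};
```

With this layout, `SsoString("def").usesHeap()` is false while a 12-character string reports true, which is the behavior the SSO change aims for with short strings like those in the question.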