How Should I Define/Declare String Constants
Pointers and arrays are different. Defining string constants as pointers or as arrays serves different purposes.
When you define a global string constant that is not subject to change, I would recommend you make it a const array:
const char product_name[] = "The program version 3";
Defining it as const char *product_name = "The program version 3"; actually defines 2 objects: the string constant itself, which will reside in a constant segment, and the pointer, which can be changed to point to another string or set to NULL.
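As a rough illustration of the "one object versus two" point, here is a minimal sketch. The names product_name_arr, product_name_ptr and the helper functions are made up for this example; they are not from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One object: the array holds the characters themselves. */
static const char product_name_arr[] = "The program version 3";

/* Two objects: an unnamed string constant plus a modifiable pointer to it. */
static const char *product_name_ptr = "The program version 3";

/* Size of the array object: string length plus the terminating null byte. */
size_t array_object_size(void) {
    return sizeof product_name_arr;
}

/* Size of the pointer object: does not depend on the string length. */
size_t pointer_object_size(void) {
    return sizeof product_name_ptr;
}

/* The pointer can be redirected (even to NULL); the array cannot. */
const char *retarget_pointer(const char *other) {
    product_name_ptr = other;
    return product_name_ptr;
}

/* Hypothetical self-check tying the three observations together. */
void check_definitions(void) {
    assert(array_object_size() == strlen(product_name_arr) + 1);
    assert(pointer_object_size() == sizeof(const char *));
    assert(retarget_pointer(NULL) == NULL);
}
```

Calling check_definitions() from main() should pass silently; nothing stops other code in the same file from calling retarget_pointer(), which is exactly the risk the const array avoids.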
Conversely, a string constant used as a local variable is better defined as a local pointer variable of type const char *, initialized with the address of a string constant:
#include <stdio.h>

int main(void) {
    const char *s1 = "world";   /* points at the string constant, no copy made */
    printf("Hello %s\n", s1);
    return 0;
}
If you define this one as an array, depending on the compiler and usage inside the function, the code will make space for the array on the stack and initialize it by copying the string constant into it, a more costly operation for long strings.
Note also that const char const *s3 = "baz"; is a redundant form of const char *s3 = "baz"; and is different from const char * const s3 = "baz"; which defines a constant pointer to a constant array of characters.
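To make the difference between the two placements of const concrete, here is a small sketch. The function name reseat_demo and the string "qux" are invented for the example; the lines that would be rejected by the compiler are left as comments.

```c
#include <assert.h>
#include <string.h>

/* Returns the string that s3 refers to after being reseated. */
const char *reseat_demo(void) {
    const char *s3 = "baz";        /* pointer to const char: the pointer is modifiable */
    const char *const s4 = "qux";  /* const pointer to const char */

    s3 = "bar";                    /* OK: s3 may point somewhere else */
    /* s3[0] = 'B';                   error: the characters are const */
    /* s4 = "bar";                    error: s4 itself is const */

    assert(strcmp(s4, "qux") == 0); /* s4 still refers to its initializer */
    return s3;
}
```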
Finally, string constants are immutable and as such should have type const char []. The C Standard purposely allows programmers to store their addresses into non-const pointers, as in char *s2 = "hello"; to avoid producing warnings for legacy code. In new code, it is highly advisable to always use const char * pointers to manipulate string constants. This may force you to declare function arguments as const char * when the function does not change the string contents. This process is known as constification and avoids subtle bugs.
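Constification in practice might look like the sketch below. count_char and count_l_in_greeting are hypothetical helpers written for this illustration: because the parameter is const char *, a string constant can be passed without a cast, and the compiler rejects any accidental write through s.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts occurrences of c in s and never writes through s. */
size_t count_char(const char *s, char c) {
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}

/* A const char * pointing at a string constant is accepted as is. */
size_t count_l_in_greeting(void) {
    const char *greeting = "hello world";
    return count_char(greeting, 'l');
}
```

Had count_char been declared with a plain char * parameter, the call in count_l_in_greeting would need a cast or would draw a diagnostic about a discarded qualifier.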
Note that some functions violate this const propagation: strchr() does not modify the string received, declared as const char *, but returns a char *. It is therefore possible to store a pointer to a string constant into a plain char * pointer this way:
char *p = strchr("Hello World\n", 'H');
This problem is solved in C++ via overloading. C programmers must deal with this as a shortcoming. An even more annoying situation is that of strtol(), where the address of a char * is passed and a cast is required to preserve proper constness.
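One possible way to contain both issues is to hide them behind small const-correct wrappers. This is only a sketch: find_space and parse_long are names made up for the example, and parse_long uses a temporary char * instead of the cast mentioned above, which is an alternative to that approach rather than the same technique.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Keep constness by returning strchr's result as const char *. */
const char *find_space(const char *s) {
    assert(s != NULL);
    return strchr(s, ' ');   /* char * converts implicitly to const char * */
}

/* Parse a leading base-10 integer; *rest receives the first unparsed character. */
long parse_long(const char *s, const char **rest) {
    char *end;               /* strtol insists on a char ** */
    long value;

    assert(s != NULL && rest != NULL);
    value = strtol(s, &end, 10);
    *rest = end;             /* adding const back needs no cast */
    return value;
    /* The cast-based form would be: strtol(s, (char **)rest, 10) */
}
```

With these wrappers, callers can keep using const char * throughout and the non-const pointer never leaves the wrapper.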
The linked article explores a small artificial situation, and the difference demonstrated vanishes if you insert const after * in const char *ptr = "Lorum ipsum"; (tested in Apple LLVM 10.0.0 with clang-1000.11.45.5).
The compiler had to load ptr only because it could be changed in some other module not visible to the compiler. Making the pointer const eliminates that, and the compiler can prepare the address of the string directly, without loading the pointer.
If you are going to declare a pointer to a string and never change the pointer, then declare it as static const char * const ptr = "string"; and the compiler can happily provide the address of the string whenever the value of ptr is used. It does not need to actually load the contents of ptr from memory, since it can never change and will be known to point to wherever the compiler chooses to store the string. This is then the same as static const char array[] = "string"; where, whenever the address of the array is needed, the compiler can provide it from its knowledge of where it chose to store the array.
Furthermore, with the static specifier, ptr cannot be known outside the translation unit (the file being compiled), so the compiler can remove it during optimization (as long as you have not taken its address, perhaps when passing it to another routine outside the translation unit). The result should be no differences between the pointer method and the array method.
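A small pair of functions makes this easy to check on your own compiler. This is a sketch with invented function names; whether the two functions actually compile to identical code depends on the compiler and optimization level, so compare the generated assembly (for example with -O2 -S on gcc or clang) rather than taking it on faith.

```c
#include <assert.h>
#include <string.h>

/* Pointer method: both the pointer and the characters are const. */
static const char *const ptr = "string";

/* Array method: the characters live in the named array. */
static const char array[] = "string";

/* With optimization, the compiler may use the string's address directly here. */
size_t length_via_pointer(void) {
    return strlen(ptr);
}

size_t length_via_array(void) {
    return strlen(array);
}

/* Hypothetical self-check: both methods must at least behave identically. */
void check_same_length(void) {
    assert(length_via_pointer() == length_via_array());
}
```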
Rule of thumb: tell the compiler as much as you know about stuff. If it will never change, mark it const. If it is local to the current module, mark it static. The more information the compiler has, the more it can optimize.
From the performance perspective, this is a fairly small optimization which makes sense for low-level code that needs to run with the lowest possible latency.
However, I would argue that const char s3[] = "bux"; is better from the semantic perspective, because the type of the right-hand side is closer to the type of the left-hand side. For that reason, I think it makes sense to declare string constants with the array syntax.