Understanding char *, char[] and strcpy()
Your understanding is not totally correct, unfortunately.
char *
points at character data, and since there's no const
in there, you can write to the data being pointed to.
However, it's perfectly possible to do this:
char *a = "hello";
which gives you a non-const pointer to data you must not modify: string literals are typically stored in read-only memory, and modifying one is undefined behaviour, yet their type in C is not const-qualified.
It's better to write the above as:
const char *a = "hello";
This makes it clearer that you cannot modify the data pointed at by a.
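As a minimal sketch of the difference, the hypothetical helper below (make_greeting, buf and size are names made up for illustration) keeps the literal behind a const pointer and only ever writes to a separate buffer supplied by the caller:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies a literal into a caller-provided writable
   buffer and capitalizes the first character. Returns 0 on success,
   -1 if the buffer is too small. */
int make_greeting(char *buf, size_t size)
{
    const char *a = "hello";      /* read-only view of the literal */

    if (size < strlen(a) + 1)     /* room for the text plus '\0'? */
        return -1;

    strcpy(buf, a);               /* copy into writable storage */
    buf[0] = 'H';                 /* fine: buf is not the literal */
    /* a[0] = 'H'; would not compile, because a points to const char */
    return 0;
}
```

Called with a 6-byte array, this should leave "Hello" in the array while the literal itself is never touched.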
Also, your examples mixing malloc()
and assignment are wrong.
This:
char *dest = malloc(5);
dest = "FIVE"; /* BAD CODE */
is bad code, and you should never do that. It simply overwrites the pointer returned by malloc()
with a pointer to the string "FIVE"
which exists somewhere in (again, read-only) memory as a string literal.
The proper way to initialize newly allocated memory with string data is to use strcpy()
:
char *dest = malloc(5);
if(dest != NULL)
    strcpy(dest, "five");
Note that checking the return value of malloc()
is a good idea.
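The 5 in malloc(5) is the four characters of "five" plus the terminating '\0'. A hedged sketch of computing that size instead of hard-coding it (copy_string is a hypothetical helper name, not a standard function):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of text, or NULL on failure.
   The caller owns the result and must free() it. */
char *copy_string(const char *text)
{
    size_t size = strlen(text) + 1;   /* +1 for the terminating '\0' */
    char *dest = malloc(size);

    if (dest != NULL)
        strcpy(dest, text);           /* fits: dest holds exactly size bytes */
    return dest;
}
```

A caller would check the result against NULL before using it and pass it to free() when done.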
There's no problem doing multiple writes to the same memory; that's a very basic idea in C. Variables represent memory, and can be given different values at different times by being "written over".
Something as simple as:
int a = 2;
printf("a=%d\n", a);
a = 4;
printf("a=%d\n", a);
demonstrates this, and it works just fine for strings too of course since they are just blocks of memory.
You can extend the above malloc()
-based example:
char *dest = malloc(5);
if(dest != NULL)
{
    strcpy(dest, "five");
    printf("dest='%s'\n", dest);
    strcpy(dest, "four");
    printf("dest='%s'\n", dest);
    strcpy(dest, "one");
    printf("dest='%s'\n", dest);
}
and it will print:
dest='five'
dest='four'
dest='one'
My understanding is as follows:
char *
points to a string constant, modifying the data it points to is undefined. You can however change where it points to.
Here you refer to an expression like
char * string = "mystring";
You are right that doing string[1]='r';
is undefined. But that is not because of the char *
, but because string points at a string literal, which is typically placed in read-only memory and must not be modified.
Compare this to
char string[] = "mystring";
which defines an array in writable memory and copies the string into it. Here it is allowed to do string[1] = 'r';
, because we are in normal data memory.
This seems to support your assumption, but take this:
char string[] = "mystring";
char * string2 = string;
Here string2[1] = 'r';
is valid, because it points to a location where writing is ok as well.
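To illustrate that the write through the pointer really lands in the array, here is a small sketch (alias_demo and out are hypothetical names added for the example):

```c
#include <string.h>

/* Hypothetical demo: writes through a pointer that aliases a local array,
   then copies the result into out (which must hold at least 9 bytes). */
void alias_demo(char *out)
{
    char string[] = "mystring";   /* writable array, initialized from the literal */
    char *string2 = string;       /* points at the array's first element */

    string2[1] = 'r';             /* modifies string[1] as well */
    strcpy(out, string);          /* out now holds "mrstring" */
}
```

So what matters is where the pointer points, not the fact that it is a char *.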
char[] refers to a block of memory that you can change its contents but not what it refers to.
Yes, because there the name refers to the array itself and not to a pointer, so it cannot be made to refer to anything else.
strcpy(dest, src) copies src into dest.
Right.
My question is, is it incorrect to use strcpy() with the dest being a char * that is already pointing to something (as I believe the old contents will be overwritten by strcpy() - which is undefined behaviour)?
It depends on what you mean by "already pointing to something"...
For example:
char *dest = malloc(5);
dest = "FIVE";
char *src = malloc(5);
src = "NEW!";
strcpy(dest, src); /* Invalid because chars at dest are getting overwritten? */
Here you again mix up several things.
First, you have dest
point to a brand new chunk of memory. Afterwards, you have it point to somewhere else where you cannot write, and the chunk of memory is lost (memory leak).
The same happens with src
.
So the strcpy()
writes into a string literal, which is undefined behaviour and often crashes in practice.
You can do
char *dest = malloc(5);
char *src = "NEW!";
strcpy(dest, src);
as here dest
points to a writable place, and src
points to useful data.
A quick analysis:
char *dest = malloc(5);
// 'dest' is set to point to a piece of allocated memory
// (typically located in the heap)
dest = "FIVE";
// 'dest' is set to point to a constant string
// (typically located in the code-section or in the data-section)
You are assigning variable dest
twice, so obviously, the first assignment has no meaning.
It's like writing:
int i = 5;
i = 6;
On top of that, you "lose" the address of the allocated memory, so you will not be able to release it later.
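A sketch of what was probably intended, under the assumption that the goal is to replace the contents of the allocated block rather than re-point dest (overwrite_demo and out are hypothetical names):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: overwrites an existing string in allocated memory.
   Copies the final contents into out (at least 5 bytes) and returns 0,
   or returns -1 if the allocation fails. */
int overwrite_demo(char *out)
{
    char *dest = malloc(5);       /* room for 4 characters plus '\0' */

    if (dest == NULL)
        return -1;

    strcpy(dest, "FIVE");         /* fill the block; dest itself is unchanged */
    strcpy(dest, "NEW!");         /* overwrite the old contents: well defined */
    strcpy(out, dest);
    free(dest);                   /* still possible: the address was never lost */
    return 0;
}
```

Because dest is only ever assigned once, the second strcpy() simply replaces the old characters, and the block can still be released at the end.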