What is the best alternative to strncpy()?
As an illustration of having to use an alternative to strncpy(), consider Git 2.19 (Q3 2018), which finds that it is too easy to misuse system API functions such as strcat(), strncpy(), ...; these selected functions are now forbidden in this codebase and will cause a compilation failure.
That patch does list several alternatives, which makes it relevant for this question.
See commit e488b7a, commit cc8fdae, commit 1b11b64 (24 Jul 2018), and commit c8af66a (26 Jul 2018) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit e28daf2, 15 Aug 2018)
banned.h: mark strncpy() as banned
The strncpy() function is less horrible than strcpy(), but is still pretty easy to misuse because of its funny termination semantics.
Namely, that if it truncates it omits the NUL terminator, and you must remember to add it yourself. Even if you use it correctly, it's sometimes hard for a reader to verify this without hunting through the code.
If you're thinking about using it, consider instead:
- strlcpy() if you really just need a truncated but NUL-terminated string (we provide a compat version, so it's always available)
- xsnprintf() if you're sure that what you're copying should fit
- strbuf or xstrfmt() if you need to handle arbitrary-length heap-allocated strings.
Note that there is one instance of strncpy in compat/regex/regcomp.c, which is fine (it allocates a sufficiently large string before copying).
But this doesn't trigger the ban-list even when compiling with NO_REGEX=1, because:
- we don't use git-compat-util.h when compiling it (instead we rely on the system includes from the upstream library); and
- it's in an "#ifdef DEBUG" block
Since it doesn't trigger the banned.h code, we're better off leaving it to keep our divergence from upstream minimal.
Note: about six months later, with Git 2.21 (Q1 2019), the strncat() function itself is now also among the banned functions.
See commit ace5707 (02 Jan 2019) by Eric Wong (ele828).
(Merged by Junio C Hamano -- gitster -- in commit 81bf66b, 18 Jan 2019)
banned.h: mark strncat() as banned
strncat() has the same quadratic behavior as strcat() and is difficult-to-read and bug-prone. While it hasn't yet been a problem in Git itself, strncat() found its way into 'master' of cgit and caused segfaults on my system.
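The quadratic cost mentioned in that commit message comes from strcat() and strncat() rescanning the destination from its start on every call. Here is a minimal sketch of one way to avoid that by tracking the current length explicitly. The helper name append_at is hypothetical; it is neither a Git nor a libc function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: append src to dst at offset *len without
 * rescanning dst. cap is the total capacity of dst in bytes.
 * Returns 0 on success, or -1 if src does not fit, in which case
 * dst and *len are left unchanged.
 */
int append_at(char *dst, size_t cap, size_t *len, const char *src)
{
    size_t n = strlen(src);

    if (*len >= cap || n >= cap - *len)
        return -1;
    memcpy(dst + *len, src, n + 1);   /* n + 1 also copies the terminator */
    *len += n;
    return 0;
}
```

Each call costs time proportional to the appended piece only, and an append that would overflow is refused rather than silently truncated.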
With Git 2.24 (Q4 2019), the ban on vsprintf uses the explicit form 'vsprintf' as the banned name, not 'sprintf'.
See commit 60d198d (25 Aug 2019) by Taylor Blau (ttaylorr).
(Merged by Junio C Hamano -- gitster -- in commit 37801f0, 30 Sep 2019)
If the length of the string you want to copy is unknown, you can use snprintf() here. This function sends formatted output to str. It acts similarly to sprintf(), but never writes more than n bytes to str, terminator included. If the resulting string would be longer than n-1 characters, the remaining characters are left out. It also always includes the null terminator \0, unless the buffer size is 0.
This would be an alternative to strncpy() or strcpy(), if you really don't want to use them. However, manually adding a null terminator at the end of your string with strncpy() is always a simple, efficient approach. It is very normal in C to add a null terminator at the end of any processed string.
Here is a basic example of using snprintf():
#include <stdio.h>
#include <string.h>

#define SIZE 1024

int main(void) {
    char str[SIZE];
    const char *example = "Hello World";

    /* snprintf() writes at most sizeof(str) bytes, terminator included */
    snprintf(str, sizeof(str), "%s", example);
    printf("String = %s, Length = %zu\n", str, strlen(str));
    return 0;
}
Which prints out:
String = Hello World, Length = 11
This example shows that snprintf() copied "Hello World" into str, and also added a \0 terminator at the end.
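Note: strlen() only works on null terminated strings, and will cause undefined behaviour if the string is not null terminated. A call to snprintf() should also check the return value: a negative value signals an error, and a value greater than or equal to the buffer size means the output was truncated. The details can be found on the man page.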
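As a rough sketch of the return-value checks the man page describes (copy_checked is a hypothetical name, not a standard function):

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical wrapper: copy src into dst (capacity size) with snprintf().
 * Returns 0 on success, 1 if the output was truncated, -1 on error.
 */
int copy_checked(char *dst, size_t size, const char *src)
{
    int ret = snprintf(dst, size, "%s", src);

    if (ret < 0)
        return -1;              /* output or encoding error */
    if ((size_t)ret >= size)
        return 1;               /* ret + 1 bytes were needed, only size available */
    return 0;
}
```

The caller can then decide whether a truncated copy is acceptable or whether a larger buffer is required.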
As others have said, this is not an efficient approach, but it is there if you go looking.
If the behavior you want is a truncating version of strcpy that copies the longest initial prefix of the source string into a buffer of known size, there are multiple options for you:
- You can write a tailor-made function that does the job:
    char *safe_strcpy(char *dest, size_t size, const char *src) {
        if (size > 0) {
            size_t i;
            for (i = 0; i < size - 1 && src[i]; i++) {
                dest[i] = src[i];
            }
            dest[i] = '\0';
        }
        return dest;
    }
- Most BSD systems have a function size_t strlcpy(char *dest, const char *src, size_t n); that copies the same way, except that it returns strlen(src) instead of dest. The order of its arguments is confusing, as n is usually the size of the dest array but comes after the src argument.
- You can use strncat():
    char *safe_strcpy(char *dest, size_t size, const char *src) {
        if (size > 0) {
            *dest = '\0';
            return strncat(dest, src, size - 1);
        }
        return dest;
    }
- You can use snprintf() or sprintf(), but it feels like using a hydraulic press to drive in a nail:
    snprintf(dest, size, "%s", src);
  Alternately:
    if (size > 0) {
        sprintf(dest, "%.*s", (int)(size - 1), src);
    }
- You can use strlen() and memcpy(), but this is only possible if you know that the source pointer points to a null terminated string. It is also less efficient than both of the above solutions if the source string is much longer than the destination array:
    char *safe_strcpy(char *dest, size_t size, const char *src) {
        if (size > 0) {
            size_t len = strlen(src);
            if (len >= size)
                len = size - 1;
            memcpy(dest, src, len);
            dest[len] = '\0';
        }
        return dest;
    }
- The inefficiency can be avoided with strnlen() if available on the target system:
    char *safe_strcpy(char *dest, size_t size, const char *src) {
        if (size > 0) {
            size_t len = strnlen(src, size - 1);
            memcpy(dest, src, len);
            dest[len] = '\0';
        }
        return dest;
    }
- You could use strncpy() and force null termination. This would be inefficient if the destination array is large because strncpy() also fills the rest of the destination array with null bytes if the source string is shorter. This function's semantics are very counter-intuitive, poorly understood and error-prone. Even when used correctly, occurrences of strncpy() are bugs waiting to bite, as the next programmer, bolder but less savvy, might alter the code and introduce them in an attempt to optimize code he does not fully understand. Play it safe: avoid this function.
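To make the last option concrete, here is a minimal sketch of strncpy() with forced termination. The helper name copy_with_strncpy is hypothetical, and the sketch assumes size > 0. It only illustrates the semantics described above and is not a recommendation.

```c
#include <string.h>

/*
 * Hypothetical helper: strncpy() followed by forced termination.
 * dst must have room for size bytes, and size must be greater than 0.
 */
char *copy_with_strncpy(char *dst, size_t size, const char *src)
{
    /*
     * Copies at most size - 1 bytes. If src is shorter, every remaining
     * byte of that range is set to '\0', which is wasted work for a
     * large destination.
     */
    strncpy(dst, src, size - 1);
    /* strncpy() adds no terminator when it truncates, so add one here. */
    dst[size - 1] = '\0';
    return dst;
}
```

Forgetting the explicit terminator, or passing size instead of size - 1, is exactly the kind of slip that makes this function risky to maintain.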
Another aspect of this question is the ability for the caller to detect truncation. The above implementations of safe_strcpy return the target pointer, as strcpy does, hence do not provide any information to the caller. snprintf() returns an int representing the number of characters that would have been copied if the target array was large enough; in this case, the return value is strlen(src) converted to int, which allows the caller to detect truncation and other errors.
Here is another function more appropriate for composing a string from different parts:
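#include <string.h>

/* Copies src at offset pos in dest (capacity size), truncating if needed.
   Returns the position the string would end at without truncation. */
size_t strcpy_at(char *dest, size_t size, size_t pos, const char *src) {
    size_t len = strlen(src);
    if (pos < size) {
        size_t chunk = size - pos - 1;
        if (chunk > len)
            chunk = len;
        memcpy(dest + pos, src, chunk);
        dest[pos + chunk] = '\0';
    }
    return pos + len;
}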
This function can be used in sequences without undefined behavior:
#include <stdio.h>
#include <stdlib.h>

void say_hello(const char **names, size_t count) {
    char buf[BUFSIZ];
    char *p = buf;
    size_t size = sizeof buf;

    for (;;) {
        size_t pos = strcpy_at(p, size, 0, "Hello");
        for (size_t i = 0; i < count; i++) {
            pos = strcpy_at(p, size, pos, " ");
            pos = strcpy_at(p, size, pos, names[i]);
        }
        pos = strcpy_at(p, size, pos, "!");
        if (pos >= size && p == buf) {
            // allocate a larger buffer if required
            p = malloc(size = pos + 1);
            if (p != NULL)
                continue;
            // allocation failed: fall back to the truncated text in buf
            p = buf;
        }
        printf("%s\n", p);
        if (p != buf)
            free(p);
        break;
    }
}
An equivalent approach for snprintf would be useful too, passing pos by address:
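#include <stdarg.h>
#include <stdio.h>

int snprintf_at(char *s, size_t n, size_t *ppos, const char *format, ...) {
    va_list arg;
    int ret;
    size_t pos = *ppos;

    if (pos < n) {
        s += pos;
        n -= pos;
    } else {
        /* already past the end: only compute the length */
        s = NULL;
        n = 0;
    }
    va_start(arg, format);
    /* a va_list must be forwarded with vsnprintf(), not snprintf() */
    ret = vsnprintf(s, n, format, arg);
    va_end(arg);
    if (ret >= 0)
        *ppos += ret;
    return ret;
}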
Passing pos by address instead of by value allows snprintf_at to return snprintf's return value, which can be -1 in case of encoding error.
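A usage sketch for chaining such calls is shown below. It assumes a variant of snprintf_at that forwards its arguments with vsnprintf() and returns int, restated here so the block is self-contained. format_pair is a hypothetical example function.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Variant of snprintf_at, restated so this sketch compiles on its own. */
int snprintf_at(char *s, size_t n, size_t *ppos, const char *format, ...)
{
    va_list arg;
    int ret;
    size_t pos = *ppos;

    if (pos < n) {
        s += pos;
        n -= pos;
    } else {
        s = NULL;   /* past the end: only measure the output */
        n = 0;
    }
    va_start(arg, format);
    ret = vsnprintf(s, n, format, arg);
    va_end(arg);
    if (ret >= 0)
        *ppos += (size_t)ret;
    return ret;
}

/*
 * Hypothetical example: format "name=value" into buf.
 * Returns the length the full text needs (terminator excluded),
 * or -1 if any formatting call failed.
 */
int format_pair(char *buf, size_t size, const char *name, int value)
{
    size_t pos = 0;

    if (snprintf_at(buf, size, &pos, "%s", name) < 0 ||
        snprintf_at(buf, size, &pos, "=%d", value) < 0)
        return -1;
    return (int)pos;
}
```

If the returned length is greater than or equal to size, the text in buf was truncated but is still null terminated, provided size is not 0.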