How to read the content of a file to a string in C?
I tend to just load the entire buffer as a raw memory chunk into memory and do the parsing on my own. That way I have best control over what the standard lib does on multiple platforms.
This is a stub I use for this. you may also want to check the error-codes for fseek, ftell and fread. (omitted for clarity).
char * buffer = 0;
long length;
FILE * f = fopen (filename, "rb");
if (f)
{
fseek (f, 0, SEEK_END);
length = ftell (f);
fseek (f, 0, SEEK_SET);
buffer = malloc (length);
if (buffer)
{
fread (buffer, 1, length, f);
}
fclose (f);
}
if (buffer)
{
// start to process your data / extract strings here...
}
If "read its contents into a string" means that the file does not contain characters with code 0, you can also use getdelim() function, that either accepts a block of memory and reallocates it if necessary, or just allocates the entire buffer for you, and reads the file into it until it encounters a specified delimiter or end of file. Just pass '\0' as the delimiter to read the entire file.
This function is available in the GNU C Library, http://www.gnu.org/software/libc/manual/html_mono/libc.html#index-getdelim-994
The sample code might look as simple as
char* buffer = NULL;
size_t len;
ssize_t bytes_read = getdelim( &buffer, &len, '\0', fp);
if ( bytes_read != -1) {
/* Success, now the entire file is in the buffer */
If you are reading special files like stdin or a pipe, you are not going to be able to use fstat to get the file size beforehand. Also, if you are reading a binary file fgets is going to lose the string size information because of embedded '\0' characters. Best way to read a file then is to use read and realloc:
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
int main () {
char buf[4096];
ssize_t n;
char *str = NULL;
size_t len = 0;
while (n = read(STDIN_FILENO, buf, sizeof buf)) {
if (n < 0) {
if (errno == EAGAIN)
continue;
perror("read");
break;
}
str = realloc(str, len + n + 1);
memcpy(str + len, buf, n);
len += n;
str[len] = '\0';
}
printf("%.*s\n", len, str);
return 0;
}
Another, unfortunately highly OS-dependent, solution is memory mapping the file. The benefits generally include performance of the read, and reduced memory use as the applications view and operating systems file cache can actually share the physical memory.
POSIX code would look like this:
int fd = open("filename", O_RDONLY);
int len = lseek(fd, 0, SEEK_END);
void *data = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
Windows on the other hand is little more tricky, and unfortunately I don't have a compiler in front of me to test, but the functionality is provided by CreateFileMapping()
and MapViewOfFile()
.