What is the better way to check EOF and error of fgetc()?

I usually program input loops like this:

int c;

while (c = fgetc(fp), c != EOF) {
    /* do something with c here */
}

/* check if EOF came from an end-of-file or an error */
if (ferror(fp)) {
    /* error handling here */
}

You should generally not use a loop condition like this:

while (!feof(fp)) {
    /* do stuff */
}

or

for (;;) {
    c = fgetc(fp);
    if (feof(fp))
        break;
}

Because this breaks when an IO error is encountered. In this case, fgetc returns EOF but the end-of-file flag is not set. Your code could enter an infinite loop as an error condition usually persists until external action is taken.

The proper way is to check the result of fgetc(): If it's equal to EOF, you can usually stop reading further data as both in case of an IO error and an end-of-file condition, it's usually not possible to read further data. You should then check if an error occurred and take appropriate action.


2 Interesting Issues

ferror()

ferror() reflects the state of the error indicator for the stream. This flag is set when a rare input error occurs and remains set until cleared - see clearerr(). If a read input errors and code later reads again, without clearing, ferror() still reports true, even if the following read is not in error.

When fgetc() returns EOF it could be due to an end-of-file (common) or the rare input error. Better to check feof() than ferror() to distinguish. ferror() could be true due a prior error and not the present case - which is certainly end-of-file.

int c;
c = fgetc(file);
if (c == EOF) {
  if (feof(file)) puts("end-of-file");
  else puts("input error");
}

Wide char: The issue of testing for an error condition came up because of a corner case in C.

fgetc() returns an int. Its values are in the range of unsigned char and EOF, (some negative number).

int ch;
while ((ch = fgetc(fp)) != EOF) {
  // do something with ch
}
if (ferror(fp)) Handle_InputError();
if (feof(fp)) Handle_EndOffFile();  // Usually nothing special

Yet C allows unsigned char to have a wider range than the positive number of int. The converting an unsigned char to int has an implementation defined behavior which may result in an unsigned char value being converted to a negative int - and one that matches EOF.

Such platforms are rare and not in the main-stream of 2015. Most will have UCHAR_MAX <= INT_MAX and the above style is typically used. Doubtful these platforms will ever become common due to the amount of code, like the above, that relies on EOF being distinct from unsigned charconverted to int.

Should code need to handle the rare case where UCHAR_MAX > INT_MAX, then

int c;
for (;;)
{
    c = fgetc(file);
    if (c == EOF) {
      if (feof(file)) break;
      if (ferror(file)) break;
      // fall through if both if's fail.
    }
    // do stuff with c
}

The popular reference in while ( !feof (file) ) always wrong? highlights the mistake code often makes in using the results of fgetc(in) before checking for problems. Both codes above check for error conditions before using the result of fgetc().


The 2nd code handles all situations including ones that may only apply to a computer sitting in a some long forgotten rubbish heap. The first is far more common.

Tags:

C

Fgetc