Is fgets() returning NULL with a short buffer compliant?

The C Standard (C11 n1570 draft) specifies fgets() this way (some emphasis mine):

7.21.7.2 The fgets function

Synopsis

   #include <stdio.h>
   char *fgets(char * restrict s, int n,
               FILE * restrict stream);

Description

The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.

Returns

The fgets function returns s if successful. If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.

The phrase "reads at most one less than the number of characters specified by n" is not precise enough. A negative number cannot represent a number of characters, but 0 does mean no characters. Reading at most -1 characters does not seem possible, so the case of n <= 0 is not defined by the Standard and therefore has undefined behavior.

For n = 1, fgets is specified as reading at most 0 characters, which it should succeed at unless the stream is invalid or in an error condition. The phrase "A null character is written immediately after the last character read into the array" is ambiguous because no characters have been read into the array, but it makes sense to interpret this special case as meaning s[0] = '\0';. The specification for gets_s offers the same reading, with the same imprecision. Again, the behavior is not explicitly defined, so it is undefined [1].
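One way to see what a particular C library does for n = 1 is to call fgets on a stream with known content and record the return value, s[0], and the stream indicators. The following is only a sketch: the names fgets_probe and probe_fgets are made up for illustration, it uses tmpfile() as the test stream, and it assumes n does not exceed the 16-byte local buffer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type: what the local fgets() did for a given n. */
struct fgets_probe {
    int returned_s;   /* 1 if fgets returned s, 0 if it returned NULL */
    char first_char;  /* s[0] after the call ('X' if left untouched) */
    int eof_flag;     /* 1 if feof(stream) is set afterwards */
    int error_flag;   /* 1 if ferror(stream) is set afterwards */
};

/* Writes one line to a temporary file, rewinds, then calls fgets(s, n, f).
   Assumption: n is no larger than the 16-byte buffer used here. */
static struct fgets_probe probe_fgets(int n)
{
    struct fgets_probe r = { 0, 'X', 0, 0 };
    char s[16];
    FILE *f = tmpfile();

    if (f == NULL)
        return r;               /* could not create the test stream */

    fputs("hello\n", f);
    rewind(f);

    memset(s, 'X', sizeof s);   /* sentinel so an untouched buffer is visible */
    r.returned_s = (fgets(s, n, f) != NULL);
    r.first_char = s[0];
    r.eof_flag = (feof(f) != 0);
    r.error_flag = (ferror(f) != 0);

    fclose(f);
    return r;
}
```

Calling probe_fgets(1) and printing the four fields shows which interpretation the library chose; under the reading suggested above, returned_s would be 1 and first_char would be '\0'.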

The specification of snprintf is more precise: the case of n = 0 is explicitly specified, with useful semantics attached. Unfortunately, such semantics cannot be implemented for fgets:

7.21.6.5 The snprintf function

Synopsis

   #include <stdio.h>
   int snprintf(char * restrict s, size_t n,
                const char * restrict format, ...);

Description

The snprintf function is equivalent to fprintf, except that the output is written into an array (specified by argument s) rather than to a stream. If n is zero, nothing is written, and s may be a null pointer. Otherwise, output characters beyond the n-1st are discarded rather than being written to the array, and a null character is written at the end of the characters actually written into the array. If copying takes place between objects that overlap, the behavior is undefined.
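The useful semantics of n = 0 for snprintf come from its return value, which is the number of characters the complete output would have needed. A minimal sketch of the usual sizing idiom follows; format_id and the "id=%d" format are hypothetical and exist only for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats "id=<value>" into a freshly allocated string.
   Returns NULL on formatting or allocation failure; the caller frees it. */
static char *format_id(int value)
{
    /* With n == 0 nothing is written and s may be a null pointer;
       the return value is the length the full output would have had. */
    int len = snprintf(NULL, 0, "id=%d", value);
    if (len < 0)
        return NULL;

    char *out = malloc((size_t)len + 1);   /* +1 for the null character */
    if (out == NULL)
        return NULL;

    snprintf(out, (size_t)len + 1, "id=%d", value);
    return out;
}
```

Nothing comparable is possible for fgets, because a call that stores nothing cannot report how long the pending line is without consuming it.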

The specification for gets_s() also clarifies the case of n = 0 and makes it a runtime-constraint violation:

K.3.5.4.1 The gets_s function

Synopsis

   #define __STDC_WANT_LIB_EXT1__ 1
   #include <stdio.h>
   char *gets_s(char *s, rsize_t n);

Runtime-constraints

s shall not be a null pointer. n shall neither be equal to zero nor be greater than RSIZE_MAX. A new-line character, end-of-file, or read error shall occur within reading n-1 characters from stdin.

If there is a runtime-constraint violation, s[0] is set to the null character, and characters are read and discarded from stdin until a new-line character is read, or end-of-file or a read error occurs.

Description

The gets_s function reads at most one less than the number of characters specified by n from the stream pointed to by stdin, into the array pointed to by s. No additional characters are read after a new-line character (which is discarded) or after end-of-file. The discarded new-line character does not count towards number of characters read. A null character is written immediately after the last character read into the array.

If end-of-file is encountered and no characters have been read into the array, or if a read error occurs during the operation, then s[0] is set to the null character, and the other elements of s take unspecified values.

Recommended practice

The fgets function allows properly-written programs to safely process input lines too long to store in the result array. In general this requires that callers of fgets pay attention to the presence or absence of a new-line character in the result array. Consider using fgets (along with any needed processing based on new-line characters) instead of gets_s.

Returns

The gets_s function returns s if successful. If there was a runtime-constraint violation, or if end-of-file is encountered and no characters have been read into the array, or if a read error occurs during the operation, then a null pointer is returned.
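The recommended practice quoted above, paying attention to the presence or absence of a new-line character, could be sketched as follows. The name read_chunk and its return convention are assumptions made for this example, and the buffer is assumed to hold at least 2 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one chunk with fgets and reports whether the
   chunk ends a line. Returns
     -1 if fgets returned NULL (end-of-file or read error),
      1 if the chunk ends a line (new-line found, or input ended),
      0 if the buffer filled up before a new-line was seen.
   Assumption: size >= 2. */
static int read_chunk(char *buf, int size, FILE *stream)
{
    if (fgets(buf, size, stream) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';    /* strip the retained new-line */
        return 1;
    }
    /* No new-line: either the buffer filled up or the input ended. */
    return feof(stream) ? 1 : 0;
}
```

A caller that wants whole lines keeps calling read_chunk while it returns 0, appending or discarding the pieces as appropriate.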

The C library you are testing seems to have a bug for the n = 1 case, which was fixed in later versions of glibc. Returning NULL should mean some kind of failure condition (the opposite of success): end-of-file or a read error. Other cases, such as an invalid stream or a stream not open for reading, are more or less explicitly described as undefined behavior.

The cases of n = 0 and n < 0 are not defined. Returning NULL is a sensible choice, but it would be useful to clarify the description of fgets() in the Standard to require n > 0, as is the case for gets_s.
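Until the Standard says more, portable code can sidestep the question by never passing the doubtful sizes to fgets at all. The wrapper below is only a sketch of that idea: fgets_checked is a hypothetical name, and treating n <= 0 as a failure and n == 1 as an empty string is a design choice of this example, not something the Standard mandates.

```c
#include <stdio.h>

/* Hypothetical wrapper: gives n <= 0 and n == 1 a fixed meaning before
   delegating to fgets. Not part of any standard library. */
static char *fgets_checked(char *s, int n, FILE *stream)
{
    if (s == NULL || stream == NULL || n <= 0)
        return NULL;            /* reject sizes the Standard leaves open */

    if (n == 1) {
        s[0] = '\0';            /* room for the terminator only: read nothing */
        return s;
    }
    return fgets(s, n, stream); /* n >= 2: ordinary, well-specified call */
}
```

One consequence of this choice is that a NULL return for n <= 0 leaves both feof and ferror untouched, so callers cannot tell it apart from other failures by the indicators alone.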

Note that there is another specification issue for fgets: the type of the n argument should have been size_t instead of int, but this function was originally specified by the C authors before size_t was even invented, and kept unchanged in the first C Standard (C89). Changing it then was considered unacceptable because they were trying to standardize existing usage: the signature change would have created inconsistencies across C libraries and broken well written existing code that uses function pointers or unprototyped functions.


[1] The C Standard specifies in paragraph 2 of section 4, Conformance: "If a “shall” or “shall not” requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words “undefined behavior” or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe “behavior that is undefined”."


tl;dr: that version of glibc has a bug for n = 1, and the spec has (arguably) an ambiguity for n < 1; but I think newer glibc versions take the most sensible option.

The C99 spec is basically the same.

The behavior for test_fgets(s, 1) is wrong. glibc 2.19 gives the correct output (retval != NULL, s[0] == '\0').

The behavior for test_fgets(s, 0) is undefined, really. It wasn't successful (you can't read at most -1 characters), but it doesn't hit either of the two "return NULL" criteria (end-of-file with 0 characters read; read error).

However, glibc's behavior is arguably correct (returning the pointer to the unchanged s would also be OK): feof isn't set, because it hasn't hit end-of-file; ferror isn't set, because there wasn't a read error.

I suspect the logic in glibc (I don't have the source to hand) has an "if n <= 0 return NULL" near the top.

[edit:]

On reflection, I actually think that glibc's behavior for n=0 is the most correct response it could give:

  • No end-of-file was read, so feof() == 0
  • No reads took place, so no read error could have happened, so ferror() == 0

Now as for the return value: fgets cannot have read -1 characters (it's impossible). If fgets returned the passed-in pointer, it would look like a successful call. Ignoring this corner case, fgets commits to returning a null-terminated string; if it didn't in this case, you couldn't rely on it. But fgets sets the character after the last character read into the array to null. Given that we (apparently) read -1 characters on this call, would that mean setting the 0th character to null?

So, the sanest choice is to return null (in my opinion).
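The reasoning above relies on checking feof and ferror after a NULL return. A minimal sketch of that check is shown here; the enum and the name fgets_classify are hypothetical, and it assumes neither indicator was already set on the stream before the call.

```c
#include <stdio.h>

/* Hypothetical result codes for this example. */
enum fgets_outcome { FGETS_OK, FGETS_EOF, FGETS_ERROR, FGETS_UNEXPLAINED };

/* Hypothetical helper: calls fgets and explains the result using the
   stream's end-of-file and error indicators.
   Assumption: both indicators were clear before the call. */
static enum fgets_outcome fgets_classify(char *s, int n, FILE *stream)
{
    if (fgets(s, n, stream) != NULL)
        return FGETS_OK;
    if (ferror(stream))
        return FGETS_ERROR;
    if (feof(stream))
        return FGETS_EOF;
    /* NULL with neither indicator set: the situation discussed above. */
    return FGETS_UNEXPLAINED;
}
```

For n >= 2 the last branch should not be reachable on a conforming library; it is there to make the n <= 0 corner case visible rather than silently folded into end-of-file.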


The behavior is different in newer releases of glibc: for n == 1, it returns s, which indicates success. This is not an unreasonable reading of 7.19.7.2 The fgets function, paragraph 2, which says (it is the same in both C99 and C11, emphasis mine):

char *fgets(char * restrict s, int n, FILE * restrict stream);

The fgets function reads at most one less than the number of characters specified by n from the stream pointed to by stream into the array pointed to by s. No additional characters are read after a new-line character (which is retained) or after end-of-file. A null character is written immediately after the last character read into the array.

This is not terribly useful, but it does not violate anything said in the standard: it will read at most 0 characters and null-terminate. So the results you are seeing look like a bug that was fixed in later releases of glibc. It is also clearly neither an end-of-file nor a read error, as covered in paragraph 3:

[...]If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. If a read error occurs during the operation, the array contents are indeterminate and a null pointer is returned.

As for the final case, where n == 0, this looks like it is simply undefined behavior. The draft C99 standard, section 4 Conformance, paragraph 2, says (emphasis mine):

If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.

The wording is essentially the same in C11, which only adds "or runtime-constraint" after "constraint". It is impossible to read at most -1 characters, and it is neither an end-of-file nor a read error, so we have no explicit definition of the behavior in this case. This looks like a defect, but I cannot find any defect report that covers it.