What does %[^\n] mean in C?

scanf("%[^\n]",line);

means: scan until a \n (the character produced by the Enter key) is encountered. The \n itself is not consumed.


scanf("%[^\n]",line); is a problematic way to read a line. It is worse than gets().

C defines a line as:

A text stream is an ordered sequence of characters composed into lines, each line consisting of zero or more characters plus a terminating new-line character. Whether the last line requires a terminating new-line character is implementation-defined.

scanf("%[^\n]", line) uses the specifier "%[^\n]". It scans an unlimited number of characters that match the scan-set ^\n. If none are read, the specifier fails and scanf() returns with line unaltered. If at least one character is read, all matching characters are read and saved, and a null character is appended.

The scan-set ^\n means all characters that are not (due to the '^') '\n'.


'\n' is not read

scanf("%[^\n]", ...) does not read the new-line character '\n'. It remains in stdin, so the entire line is not read.
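
To make the leftover '\n' visible, here is a minimal sketch. It assumes sscanf() as a stand-in for scanf() so the input can be a plain string (the scan-set rules are the same), and the helper name first_unread_after_scanset is made up for illustration. It also uses a width of 99 so the demo itself cannot overflow.

```c
#include <stdio.h>

/* Hypothetical helper: returns the first character that "%[^\n]" leaves
   unread in `input`, or '\0' if the directive failed or the whole string
   was consumed. sscanf() stands in for scanf() here. */
char first_unread_after_scanset(const char *input) {
    char line[100];
    int consumed = 0;

    /* %99 bounds the store; %n records how many characters were consumed. */
    if (sscanf(input, "%99[^\n]%n", line, &consumed) != 1) {
        return '\0';
    }
    return input[consumed];
}
```

For the input "hello\nworld" the helper returns '\n': the scan stopped in front of the new-line character and left it for the next read.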

Buffer overflow

The code below leads to undefined behavior (UB) should more than 99 characters be read.

char line[100];
scanf("%[^\n]",line);  // buffer overflow possible
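
If scanf() must be used anyway, a field width one less than the buffer size prevents the overflow. The following is only a sketch: it assumes sscanf() in place of scanf() so it can be exercised with strings, and LINE_SIZE and scan_bounded are illustrative names, with a deliberately small buffer to make the truncation easy to see.

```c
#include <stdio.h>
#include <string.h>

enum { LINE_SIZE = 10 };  /* illustrative size; the width below must be LINE_SIZE - 1 */

/* Hypothetical helper: stores at most LINE_SIZE - 1 non-newline characters
   from `input` in `line`. Returns the stored length, or -1 if nothing
   matched (empty line or end of input). */
int scan_bounded(const char *input, char line[LINE_SIZE]) {
    if (sscanf(input, "%9[^\n]", line) != 1) {
        line[0] = '\0';   /* leave the buffer in a defined state */
        return -1;
    }
    return (int)strlen(line);
}
```

The width has to be written as a literal in the format string, so it must be kept in sync with the array size by hand. Characters beyond the width are not lost; they stay in the input for the next read.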

Does nothing on empty line

When the line consists of only "\n", scanf("%[^\n]",line); returns a 0 without setting line[] - no null character is appended. This can readily lead to undefined behavior should subsequent code use an uninitialized line[]. The '\n' remains in stdin.
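
One defensive pattern, sketched here under the assumption that an empty string is an acceptable result for an empty line, is to give the buffer a defined value before the call and to look at the return value afterwards. sscanf() again stands in for scanf(), and scan_line_defensively is a made-up name.

```c
#include <stdio.h>

enum { LINE_SIZE = 100 };

/* Hypothetical helper: scans up to 99 non-newline characters from `input`.
   Returns what sscanf() returns: 1 if text was stored, 0 if the input
   starts with '\n' (empty line), EOF if the input is already exhausted.
   In every case `line` holds a valid string afterwards. */
int scan_line_defensively(const char *input, char line[LINE_SIZE]) {
    line[0] = '\0';   /* defined content even when the directive fails */
    return sscanf(input, "%99[^\n]", line);
}
```

With the input "\nnext\n" the call returns 0 and line is the empty string rather than uninitialized memory.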

Failure to check the return value

scanf("%[^\n]",line); assumes input succeeded. Better code would check the scanf() return value.
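
As a rough sketch of what checking the return value could look like, the helper below (read_line_checked is a hypothetical name, and fscanf() on a FILE * is used so it works with stdin or any other stream) distinguishes end-of-file from an empty line and then consumes the rest of the line, including the '\n'.

```c
#include <stdio.h>

/* Hypothetical helper: reads one line of up to 99 characters from `in`.
   Returns 1 if a line was read (possibly empty), 0 on end-of-file or
   read error. Characters beyond the 99th are silently discarded. */
int read_line_checked(FILE *in, char line[100]) {
    line[0] = '\0';
    int rc = fscanf(in, "%99[^\n]", line);
    if (rc == EOF) {
        return 0;   /* end-of-file or read error before any data */
    }

    /* rc is 1 (text stored) or 0 (empty line). Either way, consume the
       remainder of the line up to and including '\n'. */
    int ch;
    while ((ch = fgetc(in)) != '\n' && ch != EOF) {
        /* discard */
    }
    return 1;
}
```

Calling read_line_checked(stdin, line) in a loop until it returns 0 reads the input line by line. Silently dropping the excess is a design choice for this sketch; real code may prefer to report overly long lines.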


Recommendation

Do not use scanf() and instead use fgets() to read a line of input.

#include <stdio.h>
#include <string.h>

#define EXPECTED_INPUT_LENGTH_MAX 49
char line[EXPECTED_INPUT_LENGTH_MAX + 1  + 1  + 1];
//                                    \n + \0 + extra to detect overly long lines 

if (fgets(line, sizeof line, stdin)) {
  size_t len = strlen(line);
  // Lop off potential trailing \n if desired.
  if (len > 0 && line[len-1] == '\n') {
    line[--len] = '\0';
  }
  if (len > EXPECTED_INPUT_LENGTH_MAX) {
    // Handle error
    // Usually includes reading rest of line if \n not found.
  }
} else {
  // End-of-file or input error: no line was read.
}

The fgets() approach has its limitations too, e.g. reading embedded null characters.

Handling user input, possibly hostile, is challenging.
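
The "reading rest of line" step mentioned in the fgets() code could be wrapped up roughly as follows. This is a sketch, not a definitive implementation: read_line_fgets and its return codes are invented for illustration, it assumes size fits in an int, and it shares the embedded-null limitation noted above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf (size bytes).
   Returns  1 if the whole line fit (trailing '\n' removed),
            0 if the line was too long (buf holds the prefix, the rest
              of the line is discarded),
           -1 on end-of-file, read error, or size < 2. */
int read_line_fgets(FILE *in, char *buf, size_t size) {
    if (size < 2 || fgets(buf, (int)size, in) == NULL) {
        return -1;
    }

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No '\n' was stored: either the input ended without one, or the line
       did not fit. Discard up to the next '\n' and note whether anything
       was actually dropped. */
    int truncated = 0;
    int ch;
    while ((ch = fgetc(in)) != '\n' && ch != EOF) {
        truncated = 1;
    }
    return truncated ? 0 : 1;
}
```

Discarding the excess keeps the next call aligned with the start of the following line, which is usually what line-oriented input handling wants.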


[^\n] is a kind of regular expression.

  • [...]: it matches a nonempty sequence of characters from the scanset (a set of characters given by ...).
  • ^ means that the scanset is "negated": it is given by its complement.
  • ^\n: the scanset is all characters except \n.
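
To illustrate the difference between a plain and a negated scanset, here is a small sketch using sscanf() on strings. The helper names and the 15-character width are assumptions chosen only for the demo.

```c
#include <stdio.h>
#include <string.h>

/* Length of the leading run of `input` made only of 'a', 'b' or 'c'
   (plain scanset), capped at 15 characters. */
size_t leading_abc(const char *input) {
    char run[16];
    return sscanf(input, "%15[abc]", run) == 1 ? strlen(run) : 0;
}

/* Length of the leading run of `input` that contains no comma
   (negated scanset), capped at 15 characters. */
size_t leading_non_comma(const char *input) {
    char run[16];
    return sscanf(input, "%15[^,]", run) == 1 ? strlen(run) : 0;
}
```

For example, leading_abc("abcabcxyz") gives 6 and leading_non_comma("key,value") gives 3. In both cases the scan stops at the first character outside the scanset and leaves it unread, and an input that starts with such a character yields 0 because the directive needs at least one match.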

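Furthermore, fscanf (and scanf) will read the longest sequence of input characters matching the format.

So scanf("%[^\n]", s); will read all characters up to, but not including, the next \n (or EOF) and put them in s, followed by a null character. It is a common idiom for reading a line in C, although the \n itself stays in the input and, with no field width, s can overflow.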

See also §7.21.6.2 The fscanf function.
