How does a backslash-newline combo affect the value of the C preprocessor's __LINE__ macro?
Compilers implement __LINE__ by remembering physical line numbers in ways not specified by the C standard.
C 2018 6.10.8.1 1 tells us __LINE__ is replaced by “The presumed line number (within the current source file) of the current source line (an integer constant).” This specification is vague and cannot be implemented in a useful way while adhering to the standard literally.
Consider this code:
#define Assert(test) do { if (!(test)) printf("Assertion on line %d failed.\n", __LINE__); } while (0)
... Many lines of code follow, including some with line splicing.
Assert(condition);
... Many lines of code.
To be useful, this code must print the physical line number on which the Assert is used. It needs to be the physical line number so that the user can locate the line in a text editor, and it needs to be the line on which the Assert macro is replaced, not defined, because that is where the problem is detected. Both GCC and Clang do this.
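As a minimal sketch of that behavior, assuming a compiler that counts physical lines the way GCC and Clang are described as doing above, the following fragment places a spliced line between two uses of __LINE__. The names HERE, line_before, spliced_sum, and line_after are hypothetical and chosen only for illustration.

```c
/* Hypothetical helper, not from the original post: yields the line of use. */
#define HERE() (__LINE__)

static int line_before(void) { return HERE(); }

/* One logical line written as two physical lines with backslash-newline. */
static int spliced_sum(void) { return 1 + \
                                      2; }

static int line_after(void) { return HERE(); }
```

The two HERE() uses are six physical lines apart but only five logical lines apart, because splicing merges the two lines of spliced_sum into one. With physical counting, line_after() - line_before() evaluates to 6, which is the distance a text editor would show.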
However, this requires that the physical line number from before line splicing be provided during macro replacement, which occurs after line splicing. In C 2018 5.1.1.2 1, the standard specifies a translation model in which:
- in phase 2, “Each instance of a backslash character (\) followed immediately by a new-line character is deleted, splicing physical source lines to form logical source lines,” and,
- in phase 3, “The source file is decomposed into preprocessing tokens and white-space characters,” including new-line characters but not ones deleted in phase 2, and,
- in phase 4, macro invocations are expanded.
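The information lost in phase 2 can be illustrated with a rough sketch. This is not how any particular compiler is written; splice_lines and count_newlines are hypothetical helpers that apply the phase 2 rule literally to a string and then count what is left.

```c
#include <stddef.h>

/* Literal phase 2: copy src to dst, deleting every backslash-newline pair.
   dst must have room for strlen(src) + 1 bytes. Returns the output length. */
static size_t splice_lines(const char *src, char *dst)
{
    size_t n = 0;
    while (*src != '\0') {
        if (src[0] == '\\' && src[1] == '\n') {
            src += 2;            /* drop both characters */
        } else {
            dst[n++] = *src++;   /* keep everything else */
        }
    }
    dst[n] = '\0';
    return n;
}

/* Count the new-line characters in a string. */
static int count_newlines(const char *s)
{
    int count = 0;
    for (; *s != '\0'; s++) {
        if (*s == '\n')
            count++;
    }
    return count;
}
```

For the input "a \\\nb\nc\n" (three physical lines), the spliced output is "a b\nc\n", which contains only two new-line characters. Anything that sees only the spliced text can count logical lines, not physical ones.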
So, if a compiler replaces a __LINE__ macro in phase 4 and literally has only the preprocessing tokens and remaining white-space characters, it cannot know the physical line number to provide.
Therefore, a compiler cannot be implemented literally following the standard’s model of translation. To be useful, it must associate a physical line number with each preprocessing token that could be a macro name. Whenever a macro is replaced, it must propagate the associated physical line number. Then, when a __LINE__ token is finally replaced, the compiler will have the associated physical line number to replace it with.
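One way to picture this association is a toy tokenizer that splices lines and records a physical line number in the same pass. This is only a sketch under simplifying assumptions: tokens are whitespace-separated words rather than real preprocessing tokens, and struct token, TOKEN_MAX, and tokenize are hypothetical names, not part of GCC or Clang.

```c
#include <ctype.h>
#include <stddef.h>

#define TOKEN_MAX 32

/* A token together with the physical line on which it starts. */
struct token {
    char text[TOKEN_MAX];
    int line;
};

/* Split src into whitespace-separated tokens. Backslash-newline pairs are
   deleted as in phase 2, but each one still advances the physical line
   counter. Stores at most max tokens in out and returns how many it stored. */
static size_t tokenize(const char *src, struct token *out, size_t max)
{
    size_t count = 0;
    size_t len = 0;
    int line = 1;        /* current physical line */
    int start_line = 1;  /* physical line where the current token began */

    for (;;) {
        if (src[0] == '\\' && src[1] == '\n') {
            line++;      /* spliced away, yet still a physical line break */
            src += 2;
            continue;
        }
        unsigned char c = (unsigned char)*src;
        if (c == '\0' || isspace(c)) {
            if (len > 0) {           /* finish the pending token */
                out[count].text[len] = '\0';
                out[count].line = start_line;
                count++;
                len = 0;
            }
            if (c == '\0')
                break;
            if (c == '\n')
                line++;
        } else if (count < max) {
            if (len == 0)
                start_line = line;
            if (len < TOKEN_MAX - 1) /* overlong tokens are truncated */
                out[count].text[len++] = (char)c;
        }
        src++;
    }
    return count;
}
```

Given the source text "a \\\n__LINE__ b\nc", this produces four tokens, and the __LINE__ token carries line 2 even though the new-line before it was deleted by splicing. A later macro-replacement step could then substitute that stored number.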