Why are trigraphs generating errors in modern C++ compilers?
Trigraphs were introduced by the 1989 ANSI C standard and were retained in later C standards through C17 (they were removed in C23). They also appear in the first ISO C++ standard, published in 1998, and in all later C++ standards up to and including C++14. (Trigraphs were removed in C++17. Thanks to Jonathan Leffler and dyp for tracking down the details.)
Quoting a draft of the C++17 standard:
Effect on original feature: Valid C++ 2014 code that uses trigraphs may not be valid or may have different semantics in this International Standard. Implementations may choose to translate trigraphs as specified in C++ 2014 if they appear outside of a raw string literal, as part of the implementation-defined mapping from physical source file characters to the basic source character set.
They are not an optional feature in either language (prior to C++17); all conforming compilers must support them and interpret them as specified by the respective language standard.
For example, if this program:
#include <stdio.h>
int main(void) {
    if ('|' == '??!') {
        puts("ok");
    }
    else {
        puts("oops");
    }
    return 0;
}
prints oops, then your compiler is non-conforming.
But many, perhaps most, C compilers are not fully conforming by default. As long as a compiler can be made to conform to the standard in some way, that's good enough as far as the standard is concerned. (gcc requires -pedantic and -std=... to do this.)
But even if a compiler is fully conforming, there's nothing in the standard that forbids a compiler from warning about anything it likes. A conforming C compiler must diagnose any violation of a syntax rule or constraint, but it can issue as many additional warnings as it likes -- and it needn't distinguish between required diagnostics and other warnings.
Trigraphs are very rarely used. The vast majority of development systems support directly all the characters for which trigraphs substitute: #, [, \, ], ^, {, |, }, ~.
In fact, it's likely that trigraphs are used accidentally more often than they're used correctly:
fprintf(stderr, "What just happened here??!\n");
Warning about trigraphs that might alter the meaning of a program (relative to the meaning it would have if the language didn't have trigraphs) is both permitted by the ISO standard and IMHO perfectly reasonable. Most compilers probably have options to turn off such warnings.
Conversely, for a C++17 compiler that doesn't implement trigraphs, it would be reasonable to warn about sequences that would have been treated as trigraphs in C++14 or earlier, and/or to provide an option to support trigraphs. Again, an option to disable such warnings would be a good thing.
GCC is allergic to trigraphs. You have to explicitly enable them:
gcc -trigraphs ...
The GCC 4.7.1 manual says:
-trigraphs
Support ISO C trigraphs. The -ansi option (and -std options for strict ISO C conformance) implies -trigraphs.
It also says:
-Wtrigraphs
Warn if any trigraphs are encountered that might change the meaning of the program (trigraphs within comments are not warned about). This warning is enabled by -Wall.
They might be turned off by default.
"Some compilers support an option to turn recognition of trigraphs off, or disable trigraphs by default and require an option to turn them on"
GCC might be one of the latter. By default it should ignore trigraphs and issue a warning, but in this case ignoring them might be what causes the compile error.