Custom support for __attribute__((format))
With a recent version of GCC (I recommend 4.7 or newer, but you could try GCC 4.6), you can add your own variable and function attributes through a GCC plugin (using the PLUGIN_ATTRIBUTES hook) or a MELT extension.
MELT is a domain specific language to extend GCC (implemented as a [meta-]plugin).
If you use a plugin (e.g. MELT), you don't need to recompile the source code of GCC, but you do need a plugin-enabled GCC (check with gcc -v).
In 2020, MELT is no longer updated (for lack of funding); however, you could write your own GCC plugin in C++ for GCC 10 to perform such checks.
Some Linux distributions don't enable plugins in their gcc; please complain to your distribution vendor. Others provide a package for GCC plugin development, e.g. gcc-4.7-plugin-dev for Debian or Ubuntu.
It's doable, but it's certainly not easy. Part of the problem is that BaseString and BaseObject are user-defined types, so you need to define the format specifiers dynamically. Fortunately gcc has at least some support for this, but you would still need to patch the compiler.
The magic is in the handle_format_attribute function in gcc/c-family/c-format.c, which calls initialization functions for format specifiers that refer to user-defined types. A good example to base your support on would be the gcc_gfc format type, because it defines a format specifier %L for locus *:
/* This will require a "locus" at runtime. */
{ "L", 0, STD_C89, { T89_V, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN, BADLEN }, "", "R", NULL },
Obviously, though, you'd want to base your format_char_info array on print_char_table, as that defines the standard printf specifiers; gcc_gfc is substantially cut down in comparison.
The patch that added gcc_gfc is http://gcc.gnu.org/ml/fortran/2005-07/msg00018.html; it should be fairly obvious from that patch how and where you'd need to make your additions.
A year and a half after asking this question, I came up with a totally different approach to the real problem: is there any way to statically check the types of custom variadic formatting statements?
For completeness, and because it may help other people, here is the solution I finally implemented. It has two advantages over the approach in the original question:
- Relatively simple: implemented in less than a day;
- Compiler independent: can check C++ code on any platform (Windows, Android, OSX, ...).
A Perl script parses the source code, finds the format strings, and decodes the percent specifiers inside them. It then wraps each argument in a call to a template identity function, CheckFormat<>. Example:
str->appendFormat("%hhu items (%.2f %%) from %S processed",
nbItems,
nbItems * 100. / totalItems,
subject);
Becomes:
str->appendFormat("%hhu items (%.2f %%) from %S processed",
CheckFormat<CFL::u, CFM::hh>(nbItems ),
CheckFormat<CFL::f, CFM::_>(nbItems * 100. / totalItems ),
CheckFormat<CFL::S, CFM::_, const BaseString*>(subject ));
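The decoding step can be sketched as follows. The original script is written in Perl and is not shown here, so this is only a rough C++ illustration of the idea; the names FormatSpec and decodeFormat are hypothetical, and the set of flags and length modifiers is an assumption based on standard printf syntax.

```cpp
#include <string>
#include <vector>

// One decoded conversion: its length modifier ("hh", "l", "" ...) and its letter.
// Hypothetical type, introduced only for this sketch.
struct FormatSpec {
    std::string modifier;
    char letter;
};

// Hypothetical helper: walks a printf-style format string and returns, in order,
// the conversions that consume an argument. "%%" consumes nothing and is skipped;
// a '*' width or precision consumes an int and is reported with letter '*'.
std::vector<FormatSpec> decodeFormat(const std::string& fmt) {
    const std::string flags = "-+ #0";
    const std::string digits = "0123456789";
    const std::string lengthChars = "hlzL";
    std::vector<FormatSpec> specs;
    std::size_t i = 0;
    const std::size_t n = fmt.size();

    while (i < n) {
        if (fmt[i] != '%') { ++i; continue; }
        ++i;                                              // skip '%'
        if (i < n && fmt[i] == '%') { ++i; continue; }    // literal percent sign

        while (i < n && flags.find(fmt[i]) != std::string::npos) { ++i; }

        // Width: either '*' (takes an argument) or a run of digits.
        if (i < n && fmt[i] == '*') {
            specs.push_back({"", '*'});
            ++i;
        } else {
            while (i < n && digits.find(fmt[i]) != std::string::npos) { ++i; }
        }

        // Precision: '.' followed by '*' or a run of digits.
        if (i < n && fmt[i] == '.') {
            ++i;
            if (i < n && fmt[i] == '*') {
                specs.push_back({"", '*'});
                ++i;
            } else {
                while (i < n && digits.find(fmt[i]) != std::string::npos) { ++i; }
            }
        }

        // Length modifier, then the conversion letter itself.
        std::string modifier;
        while (i < n && lengthChars.find(fmt[i]) != std::string::npos) {
            modifier += fmt[i++];
        }
        if (i < n) { specs.push_back({modifier, fmt[i++]}); }
    }
    return specs;
}
```

For the format string in the example above, this yields three entries (hh/u, f, S), one for each argument that has to be wrapped.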
The enumerations CFL and CFM and the template function CheckFormat must be defined in a common header file like this (this is an extract; there are around 24 specializations in total):
enum class CFL
{
c, d, i=d, star=i, u, o=u, x=u, X=u, f, F=f, e=f, E=f, g=f, G=f, p, s, S, P=S, at
};
enum class CFM
{
hh, h, l, z, ll, L=ll, _
};
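// Generic case: assigning the argument to a CFL deliberately fails to compile, so any combination without a specialization below is reported as an error.
template<CFL letter, CFM modifier, typename T> inline T CheckFormat(T value) { CFL test= value; (void)test; return value; }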
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }
template<> inline const BaseObject* CheckFormat<CFL::at, CFM::_, const BaseObject*>(const BaseObject* value) { return value; }
template<> inline const char* CheckFormat<CFL::s, CFM::_, const char*>(const char* value) { return value; }
template<> inline const void* CheckFormat<CFL::p, CFM::_, const void*>(const void* value) { return value; }
template<> inline char CheckFormat<CFL::c, CFM::_, char>(char value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
template<> inline float CheckFormat<CFL::f, CFM::_, float>(float value) { return value; }
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }
...
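To see the mechanism in isolation, here is a reduced, self-contained version of the header. It is only a sketch: the empty BaseString struct is a hypothetical stand-in for the real class, the unsigned char specialization for %hhu is my assumption about what the full header contains, and demoCheckFormat exists purely for illustration.

```cpp
struct BaseString {};  // hypothetical stand-in for the user-defined string class

enum class CFL { d, u, f, S };
enum class CFM { hh, _ };

// Generic case: "CFL test = value" is ill-formed for ordinary argument types,
// so instantiating this template is a compile-time error.
template<CFL letter, CFM modifier, typename T>
inline T CheckFormat(T value) { CFL test = value; (void)test; return value; }

// Accepted (specifier, modifier, argument type) combinations.
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }
// Assumed specialization for %hhu; not part of the extract above.
template<> inline unsigned char CheckFormat<CFL::u, CFM::hh, unsigned char>(unsigned char value) { return value; }

// Returns true when every wrapped argument comes back unchanged.
bool demoCheckFormat() {
    BaseString subject;
    const BaseString* subjectPtr = &subject;
    unsigned char nbItems = 7;

    // The third template argument is deduced from the function argument,
    // which selects the matching specialization.
    bool ok = CheckFormat<CFL::S, CFM::_>(subjectPtr) == subjectPtr;
    ok = ok && CheckFormat<CFL::u, CFM::hh>(nbItems) == 7;
    ok = ok && CheckFormat<CFL::d, CFM::_>(42) == 42;
    ok = ok && CheckFormat<CFL::f, CFM::_>(2.5) == 2.5;

    // CheckFormat<CFL::d, CFM::_>(2.5);  // would not compile: double passed for %d
    return ok;
}
```

Since every accepted combination simply returns its argument, the wrappers should cost nothing at run time; their only purpose is to make mismatched combinations fall through to the generic template and fail.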
Once the compilation errors have been found and fixed, it is easy to recover the original form: replace the regular expression CheckFormat<[^<]*>\((.*?) \) with its capture group.
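As an illustration, the same substitution can be done with std::regex. This is only a sketch (any tool with regex search and replace works just as well), and the function name stripCheckFormat is hypothetical. Note that the pattern relies on the space the script leaves before the closing parenthesis.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: removes the CheckFormat<...>(... ) wrappers from a line
// of source code, keeping only the wrapped argument (capture group 1).
std::string stripCheckFormat(const std::string& source) {
    static const std::regex wrapper(R"re(CheckFormat<[^<]*>\((.*?) \))re");
    return std::regex_replace(source, wrapper, "$1");
}
```

Applied to the line CheckFormat<CFL::u, CFM::hh>(nbItems ), this gives back nbItems, with the trailing comma untouched.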