Custom support for __attribute__((format))

With a recent version of GCC (I recommend 4.7 or newer, though you could try GCC 4.6), you can add your own variable and function attributes through a GCC plugin (using the PLUGIN_ATTRIBUTES hook) or a MELT extension. MELT is a domain-specific language for extending GCC, implemented as a [meta-]plugin.

If you use a plugin (e.g. MELT), you won't need to recompile GCC itself, but you do need a plugin-enabled GCC (check with gcc -v).

As of 2020, MELT is no longer maintained (for lack of funding); however, you could write your own GCC plugin in C++ for GCC 10 to perform such checks.

Some Linux distributions don't enable plugins in their gcc; if yours doesn't, please complain to your distribution vendor. Others provide a package for GCC plugin development, e.g. gcc-4.7-plugin-dev on Debian or Ubuntu.


It's doable, but it's certainly not easy; part of the problem is that BaseString and BaseObject are user-defined types, so you need to define the format specifiers dynamically. Fortunately, gcc does at least have support for this, but using it would still require patching the compiler.

The magic is in the handle_format_attribute function in gcc/c-family/c-format.c, which calls initialization functions for format specifiers that refer to user-defined types. A good example to base your support on would be the gcc_gfc format type, because it defines a format specifier %L for locus *:

/* This will require a "locus" at runtime.  */
{ "L",   0, STD_C89, { T89_V,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "", "R", NULL },

Obviously though you'd want to base your format_char_info array on print_char_table, as that defines the standard printf specifiers; gcc_gfc is substantially cut down in comparison.

The patch that added gcc_gfc is http://gcc.gnu.org/ml/fortran/2005-07/msg00018.html; it should be fairly obvious from that patch how and where you'd need to make your additions.


A year and a half after asking this question, I came up with a totally different approach to the real problem: is there any way to statically check the types of custom variadic formatting statements?

For completeness, and because it may help other people, here is the solution I finally implemented. It has two advantages over the approach in the original question:

  • Relatively simple: implemented in less than a day;
  • Compiler-independent: can check C++ code on any platform (Windows, Android, OSX, ...).

A Perl script parses the source code, finds the formatting strings and decodes the percent modifiers inside them. It then wraps all arguments with a call to a template identity function CheckFormat<>. Example:

str->appendFormat("%hhu items (%.2f %%) from %S processed", 
    nbItems, 
    nbItems * 100. / totalItems, 
    subject);

Becomes:

str->appendFormat("%hhu items (%.2f %%) from %S processed", 
    CheckFormat<CFL::u, CFM::hh>(nbItems  ), 
    CheckFormat<CFL::f, CFM::_>(nbItems * 100. / totalItems  ), 
    CheckFormat<CFL::S, CFM::_, const BaseString*>(subject  ));
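The Perl script itself is not shown here. As an illustration only, the decoding step it performs could look roughly like the C++ sketch below. The names Spec, isOneOf and decodeFormat are hypothetical, and the parser is simplified: a '*' width or precision, which would also consume an argument, is skipped rather than reported.

```cpp
#include <string>
#include <vector>

// One decoded conversion: its length modifier ("" if none) and its letter.
struct Spec {
    std::string modifier;
    char letter;
};

// Returns true if c is one of the characters in set.
inline bool isOneOf(char c, const std::string& set) {
    return set.find(c) != std::string::npos;
}

// Hypothetical helper (not the author's Perl script): walks a printf-style
// format string and lists, in order, the conversions that consume an argument.
inline std::vector<Spec> decodeFormat(const std::string& fmt) {
    std::vector<Spec> specs;
    std::size_t i = 0;
    while (i < fmt.size()) {
        if (fmt[i] != '%') { ++i; continue; }
        ++i;                                                      // skip '%'
        if (i < fmt.size() && fmt[i] == '%') { ++i; continue; }   // literal "%%"
        // Skip flags, width and precision (simplified, see lead-in).
        while (i < fmt.size() && isOneOf(fmt[i], "-+ #0123456789.*")) ++i;
        // Collect the length modifier, if any (hh, h, l, ll, z, L).
        std::string mod;
        while (i < fmt.size() && isOneOf(fmt[i], "hlzL")) mod += fmt[i++];
        // Whatever follows is taken as the conversion letter.
        if (i < fmt.size()) specs.push_back({mod, fmt[i++]});
    }
    return specs;
}
```

For the format string of the example above, this yields three entries (hh/u, f and S), one per argument to wrap; the "%%" sequence is skipped because it consumes no argument.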

The enumerations CFL and CFM and the template function CheckFormat must be defined in a common header file, like this (this is an extract; there are around 24 specializations in total):

enum class CFL
{
    c, d, i=d, star=i, u, o=u, x=u, X=u, f, F=f, e=f, E=f, g=f, G=f, p, s, S, P=S, at
};
enum class CFM
{
    hh, h, l, z, ll, L=ll, _
};
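// Primary template: deliberately fails to compile when instantiated (T is not convertible to CFL),
// so any letter/modifier/type combination without a specialization below is rejected.
template<CFL letter, CFM modifier, typename T> inline T CheckFormat(T value) { CFL test= value; (void)test; return value; }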
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }
template<> inline const BaseObject* CheckFormat<CFL::at, CFM::_, const BaseObject*>(const BaseObject* value) { return value; }
template<> inline const char* CheckFormat<CFL::s, CFM::_, const char*>(const char* value) { return value; }
template<> inline const void* CheckFormat<CFL::p, CFM::_, const void*>(const void* value) { return value; }
template<> inline char CheckFormat<CFL::c, CFM::_, char>(char value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
template<> inline float CheckFormat<CFL::f, CFM::_, float>(float value) { return value; }
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }

...
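To see the mechanism in isolation, here is a reduced, self-contained sketch. It assumes an empty stand-in BaseString, trimmed-down enumerations and only four specializations; the hh/u specialization for unsigned char is my guess at what the full header contains, since it is not part of the extract.

```cpp
// Stand-in for the user-defined string class (hypothetical, empty on purpose).
struct BaseString {};

enum class CFL { c, d, i = d, u, f, s, S };   // reduced subset of the letters
enum class CFM { hh, h, l, ll, _ };           // reduced subset of the modifiers

// Primary template: ill-formed when instantiated, because T cannot be
// converted to the scoped enum CFL. It is only reached when no
// specialization matches the (letter, modifier, type) triple.
template<CFL letter, CFM modifier, typename T>
inline T CheckFormat(T value) { CFL test = value; (void)test; return value; }

// Accepted combinations: each specialization simply forwards its argument.
template<> inline int CheckFormat<CFL::d, CFM::_, int>(int value) { return value; }
template<> inline double CheckFormat<CFL::f, CFM::_, double>(double value) { return value; }
// Assumed specialization, not shown in the extract above.
template<> inline unsigned char CheckFormat<CFL::u, CFM::hh, unsigned char>(unsigned char value) { return value; }
template<> inline const BaseString* CheckFormat<CFL::S, CFM::_, const BaseString*>(const BaseString* value) { return value; }

// T is deduced from the argument, so the int specialization is selected.
inline int checkedInt(int n) { return CheckFormat<CFL::d, CFM::_>(n); }

// A mismatch such as the line below would not compile: T is deduced as
// double, no <d, _, double> specialization exists, and the primary
// template fails on "CFL test = value;".
// inline double bad(double x) { return CheckFormat<CFL::d, CFM::_>(x); }
```

Passing the pointer type explicitly, as in CheckFormat<CFL::S, CFM::_, const BaseString*>(subject), lets a non-const or derived-class pointer convert to the expected type instead of deducing a T that has no specialization.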

Once the compilation errors have been reviewed and fixed, it is easy to recover the original form: replace every match of the regular expression CheckFormat<[^<]*>\((.*?) \) with its capture group.
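The same substitution can be sketched in C++ with std::regex. This is only an illustration: the helper name unwrapCheckFormat is hypothetical, and the pattern is the one quoted above, used verbatim.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every CheckFormat<...>(argument ) wrapper
// with the wrapped argument (capture group 1 of the pattern).
inline std::string unwrapCheckFormat(const std::string& source) {
    static const std::regex wrapper(R"re(CheckFormat<[^<]*>\((.*?) \))re");
    return std::regex_replace(source, wrapper, "$1");
}
```

Because the wrapped form carries two spaces before the closing parenthesis and the pattern consumes only one, each restored argument keeps a single trailing space, which is harmless to the compiler. The pattern assumes simple arguments; an argument that itself contains ">(" or " )" would need a more careful match.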

Tags: C++, C, Printf, Gcc, Clang