Self-sufficient header files in C/C++

Old question, new answer. :-)

There is now a tool called include-what-you-use which is designed to analyse your code for exactly this kind of problem. On Debian and derived systems, it can be installed as the iwyu package.


A self-sufficient header file is one that does not depend on the context in which it is included to work correctly. If you make sure you #include or define/declare everything before you use it, you have a self-sufficient header.
An example of a header that is not self-sufficient might look like this:

----- MyClass.h -----

class MyClass
{
   MyClass(std::string s);
};

----- MyClass.cpp -----

#include <string>
#include "MyClass.h"

MyClass::MyClass(std::string s)
{}

In this example, MyClass.h uses std::string without first #including <string>. For this to work, MyClass.cpp must put #include <string> before #include "MyClass.h".
If a user of MyClass fails to do this, the compiler will report an error because std::string has not been declared.

Keeping headers self-sufficient is often neglected. For instance, you have a huge MyClass header and you add another small method that uses std::string. Everywhere this class is currently used, <string> is already #included before MyClass.h. Then someday you #include MyClass.h as the first header, and suddenly you get a batch of new errors pointing at a file you didn't even touch (MyClass.h).
Carefully keeping your headers self-sufficient helps to avoid this problem.
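
As a sketch of the fix, MyClass.h can simply include what it uses. The include guard name, the name_ member and the name() accessor are illustrative additions, not part of the original example:

```cpp
// ----- MyClass.h (self-sufficient sketch) -----
#ifndef MYCLASS_H_INCLUDED   // guard name is an assumption for this sketch
#define MYCLASS_H_INCLUDED

#include <string>    // the header names std::string, so it includes <string> itself
#include <utility>   // std::move is used below, so <utility> is included too

class MyClass
{
public:
    // Takes the string by value, as in the original example, and stores it.
    explicit MyClass(std::string s) : name_(std::move(s)) {}

    // Hypothetical accessor, added only so the sketch has something to call.
    const std::string &name() const { return name_; }

private:
    std::string name_;
};

#endif /* MYCLASS_H_INCLUDED */
```

With this version, a source file can #include "MyClass.h" as its very first line and still compile.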


Make sure you include everything you need in the header, instead of assuming that something you included includes something else you need.


NASA's Goddard Space Flight Center (GSFC) has published C and C++ programming standards that address this issue.

Assume you have a module with a source file perverse.c and its header perverse.h.

Ensuring a header is self-contained

There is a very simple way to ensure that a header is self-contained. In the source file, the first header you include is the module's header. If it compiles like this, the header is self-contained (self-sufficient). If it does not, fix the header until it is (reliably; see Footnote 1) self-contained.

perverse.h

#ifndef PERVERSE_H_INCLUDED
#define PERVERSE_H_INCLUDED

#include <stddef.h>

extern size_t perverse(const unsigned char *bytes, size_t nbytes);

#endif /* PERVERSE_H_INCLUDED */

Almost all headers should be protected against multiple inclusion. (The standard <assert.h> header is an explicit exception to the rule — hence the 'almost' qualifier.)
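
To illustrate what the guard buys you, here is a minimal sketch that pastes the contents of a hypothetical point.h twice into one translation unit, which is what two #include lines for the same header amount to. The Point type and the sum() helper are assumptions made up for this illustration:

```cpp
// First inclusion of the (hypothetical) point.h: the guard macro is not yet
// defined, so the body is compiled and the macro gets defined.
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct Point { int x; int y; };
#endif /* POINT_H_INCLUDED */

// Second inclusion: the guard macro is now defined, so the preprocessor
// skips the body and Point is not defined a second time.
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct Point { int x; int y; };
#endif /* POINT_H_INCLUDED */

// Uses the single definition of Point that survived preprocessing.
inline int sum(const Point &p) { return p.x + p.y; }
```

Without the guard lines, the second copy of struct Point would be a redefinition error.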

perverse.c

#include "perverse.h"
#include <stdio.h>   // defines size_t too

size_t perverse(const unsigned char *bytes, size_t nbytes)
{
    ...etc...
}

Note that even though it was traditionally considered a good idea to include the standard headers before the project headers, in this case, it is crucial to the testability that the module header (perverse.h) comes before all others. The only exception I'd allow is including a configuration header ahead of the module header; however, even that is dubious. If the module header needs to use (or maybe just 'can use') the information from the configuration header, it should probably include the configuration header itself, rather than rely on the source files using it to do so. However, if you need to configure which version of POSIX to request support for, that must be done before the first system header is included.


Footnote 1: Steve Jessop's comment to Shoosh's answer is why I put the parenthesized '(reliably)' comment into my 'fix it' comment. He said:

Another factor making this difficult is the "system headers can include other headers" rule in C++. If <iostream> includes <string>, then it's quite difficult to discover that you've forgotten to include <string> in some header which does [not] use <iostream> [or <string>]. Compiling the header on its own gives no errors: it's self-sufficient on this version of your compiler, but on another compiler it might not work.

See also the answer by Toby Speight about IWYU — Include What You Use.


Appendix: Matching these rules with GCC Precompiled Headers

The GCC rules for precompiled headers permit just one such header per translation unit, and it must appear before any C tokens.

GCC 4.4.1 Manual, §3.20 Using Precompiled Headers

A precompiled header file can be used only when these conditions apply:

  • Only one precompiled header can be used in a particular compilation.
  • A precompiled header can’t be used once the first C token is seen. You can have preprocessor directives before a precompiled header; you can even include a precompiled header from inside another header, so long as there are no C tokens before the #include.
  • [...]
  • Any macros defined before the precompiled header is included must either be defined in the same way as when the precompiled header was generated, or must not affect the precompiled header, which usually means that they don’t appear in the precompiled header at all.

To a first approximation, these constraints mean that the precompiled header must be the first in the file. A second approximation notes that if 'config.h' only contains #define statements, it could appear ahead of the precompiled header, but it is much more likely that (a) the defines from config.h affect the rest of the code, and (b) the precompiled header needs to include config.h anyway.

The projects I work on are not set up to use pre-compiled headers, and the constraints defined by GCC plus the anarchy induced by over 20 years of intensive maintenance and extension by a diverse population of coders mean it would be very hard to add them.

Given the divergent requirements between the GSFC guidelines and GCC precompiled headers (and assuming that precompiled headers are in use), I think that I would ensure the self-containment and idempotence of headers using a separate mechanism. I already do this for the main projects I work on — reorganizing the headers to meet the GSFC guidelines is not an easy option — and the script I use is chkhdr, shown below. You could even do this as a 'build' step in the header directory — ensure that all the headers are self-contained as a 'compilation' rule.

chkhdr script

I use this chkhdr script to check that headers are self-contained. Although the shebang says 'Korn shell', the code is actually OK with Bash or even the original (System V-ish) Bourne Shell.

#!/bin/ksh
#
# @(#)$Id: chkhdr.sh,v 1.2 2010/04/24 16:52:59 jleffler Exp $
#
# Check whether a header can be compiled standalone

tmp=chkhdr-$$
trap 'rm -f $tmp.?; exit 1' 0 1 2 3 13 15

cat >$tmp.c <<EOF
#include HEADER /* Check self-containment */
#include HEADER /* Check idempotency */
int main(void){return 0;}
EOF

options=
for file in "$@"
do
    case "$file" in
    (-*)    options="$options $file";;
    (*)     echo "$file:"
            gcc $options -DHEADER="\"$file\"" -c $tmp.c
            ;;
    esac
done

rm -f $tmp.?
trap 0

It so happens that I've never needed to pass any options containing spaces to the script, so the code is not sound in its handling of options that contain spaces. Handling them in Bourne/Korn shell at least makes the script more complex for no benefit; using Bash and an array might be better.

Usage:

chkhdr -Wstrict-prototypes -DULTRA_TURBO -I$PROJECT/include header1.h header2.h

GSFC Standard available via Internet Archive

The URL linked above is no longer functional (404). You can find the C++ standard (582-2003-004) at EverySpec.com (on page 2); the C standard (582-2000-005) seems to be missing in action.

However, the referenced NASA C coding standard can be accessed and downloaded via the Internet Archive:

http://web.archive.org/web/20090412090730/http://software.gsfc.nasa.gov/assetsbytype.cfm?TypeAsset=Standard

See also:

  • Should I use #include in headers?
  • How to link multiple implementation files in C?
  • Professional #include contents?
  • Where to document functions in C or C++?