How much do forward declarations affect compile time?
#include "myClass.h"
is 1..n lines
class myClass;
is 1 line.
You will save time unless all your headers are one-liners. A forward declaration has no cost of its own: it just tells the compiler that the name refers to a class defined elsewhere, and it works wherever the compiler does not need anything from the definition (its size or its members, for example). Every include you replace with a forward declaration therefore saves the time spent reading and parsing that header and everything it includes in turn. There is no general figure for the saving, since it depends on the project, but it is a recommended practice for large C++ projects (see Large-Scale C++ Software Design by John Lakos for more tricks for managing large C++ projects, even if some of them are dated).
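As a rough illustration of where the compiler does and does not need the full definition, here is a minimal single-file sketch. The names Engine, Car and total_power are made up for the example; in a real project the Engine definition would sit in its own header, included only by the source files that use it.

```cpp
#include <cassert>

// Forward declaration: the compiler learns only that Engine names a class.
class Engine;

class Car {
public:
    explicit Car(Engine* engine) : engine_(engine) { assert(engine != nullptr); }
    int horsepower() const;   // defined further down, once Engine is complete
private:
    Engine* engine_;          // a pointer member does not need sizeof(Engine)
    // Engine by_value_;      // would not compile here: the size of Engine is unknown
};

// Reference parameters in a declaration are fine with an incomplete type too.
int total_power(const Engine& a, const Engine& b);

// Hypothetical contents of engine.h: the full definition.
class Engine {
public:
    explicit Engine(int hp) : hp_(hp) {}
    int hp() const { return hp_; }
private:
    int hp_;
};

// Code that calls into Engine needs the complete type, so it comes after the definition.
int Car::horsepower() const { return engine_->hp(); }
int total_power(const Engine& a, const Engine& b) { return a.hp() + b.hp(); }
```

Only the last two functions touch Engine's members, so only the file containing them would need to include the header.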
Another way to limit the time the compiler spends on headers is to use pre-compiled headers.
I made a small demo which generates an artificial codebase and tests this hypothesis.
It generates 200 headers. Each header has a struct with 100 fields and a comment 5000 bytes long. 500 .c files are used for benchmarking; each one either includes all the header files or forward declares all the structs. To make it more realistic, each header is also included in its own .c file.
The result is that using includes took me 22 seconds to compile while using forward declarations took 9 seconds.
generate.py
#!/usr/bin/env python3
import random
import string
include_template = """#ifndef FILE_{0}_{1}
#define FILE_{0}_{1}
{2}
//{3}
struct c_{0}_{1} {{
{4}}};
#endif
"""

def write_file(name, content):
    # All generated files go into ./src, which run_test.sh creates beforehand.
    with open("./src/" + name, "w") as f:
        f.write(content)

GROUPS = 200
FILES_PER_GROUP = 0
EXTRA_SRC_FILES = 500
COMMENT = ''.join(random.choices(string.ascii_uppercase + string.digits, k=5000))
VAR_BLOCK = "".join(["int var_{0};\n".format(k) for k in range(100)])
main_includes = ""
main_fwd = ""
for i in range(GROUPS):
    include_statements = ""
    # Optional per-group headers that the group's main header includes.
    for j in range(FILES_PER_GROUP):
        write_file("file_{0}_{1}.h".format(i,j), include_template.format(i, j, "", COMMENT, VAR_BLOCK))
        write_file("file_{0}_{1}.c".format(i,j), "#include \"file_{0}_{1}.h\"\n".format(i,j))
        include_statements += "#include \"file_{0}_{1}.h\"\n".format(i, j)
        main_includes += "#include \"file_{0}_{1}.h\"\n".format(i,j)
        main_fwd += "struct c_{0}_{1};\n".format(i,j)
    # One header per group, plus a .c file that includes it.
    write_file("file_{0}_x.h".format(i), include_template.format(i, "x", include_statements, COMMENT, VAR_BLOCK))
    write_file("file_{0}_x.c".format(i), "#include \"file_{0}_x.h\"\n".format(i))
    main_includes += "#include \"file_{0}_x.h\"\n".format(i)
    main_fwd += "struct c_{0}_x;\n".format(i)

main_template = """
{0}
int main(void) {{ return 0; }}
"""

# Benchmark sources: one set includes every header, the other only forward declares.
for i in range(EXTRA_SRC_FILES):
    write_file("extra_inc_{0}.c".format(i), main_includes)
    write_file("extra_fwd_{0}.c".format(i), main_fwd)

write_file("maininc.c", main_template.format(main_includes))
write_file("mainfwd.c", main_template.format(main_fwd))
run_test.sh
#!/bin/bash
mkdir -p src
./generate.py
ls src/ | wc -l
du -h src/
gcc -v
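# First timing: the 500 extra files include every header.
echo src/file_*_*.c src/extra_inc_*.c src/maininc.c | xargs time gcc -o inc.out
rm -rf out/*.a
# Second timing: the 500 extra files only forward declare the structs.
echo src/file_*_*.c src/extra_fwd_*.c src/mainfwd.c | xargs time gcc -o fwd.out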
rm -rf fwd.out inc.out src
Results
$ ./run_test.sh
1402
8.2M src/
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.3 (clang-1103.0.32.29)
Target: x86_64-apple-darwin19.3.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
22.32 real 13.56 user 8.27 sys
8.51 real 4.44 user 3.78 sys
Forward declarations can make for neater, more understandable code, which surely HAS to be the goal of any decision.
Couple that with the fact that when it comes to classes it's quite possible for 2 classes to rely upon each other, which makes it hard NOT to use forward declarations without causing a nightmare.
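To make the two-classes-relying-on-each-other case concrete, here is a small sketch with hypothetical Department and Employee classes. Neither could be defined first if each needed the other's full definition up front; the forward declaration is what breaks the cycle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Employee;  // forward declaration: Department only stores pointers to it

class Department {
public:
    explicit Department(std::string name) : name_(std::move(name)) {}
    void add(Employee* e) { staff_.push_back(e); }  // never looks inside Employee
    std::size_t size() const { return staff_.size(); }
    const std::string& name() const { return name_; }
private:
    std::string name_;
    std::vector<Employee*> staff_;
};

class Employee {
public:
    Employee(std::string name, Department* dept)
        : name_(std::move(name)), dept_(dept) {
        assert(dept_ != nullptr);
        dept_->add(this);  // Department is complete here, so its members can be called
    }
    const std::string& department_name() const { return dept_->name(); }
private:
    std::string name_;
    Department* dept_;
};
```

Split across files, department.h would forward declare Employee and employee.h would include department.h (or forward declare Department and move the constructor body into a .cpp file).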
Equally, forward declaring classes in a header means that you only need to include the relevant headers in the CPPs that actually USE those classes. That actually DECREASES compile time.
Edit: Given your comment above I would point out it is ALWAYS slower to include a header file than to forward declare. Any time you include a header you are necessitating a load from disk often only to find out that the header guards mean that nothing happens. That would waste immense amounts of time and is really a VERY stupid rule to be bringing in.
Edit 2: Hard data is pretty hard to obtain. Anecdotally, I once worked on a project that wasn't strict about its header includes, and the build time was roughly 45 minutes on a 500MHz P3 with 512MB of RAM (this was a while back). After spending 2 weeks cutting down the include nightmare (by using forward declarations) I had managed to get the code to build in a little under 4 minutes. Subsequently, using forward declarations whenever possible became a rule.
Edit 3: It's also worth bearing in mind that there is a huge advantage from using forward declarations when it comes to making small modifications to your code. If headers are included all over the shop then a modification to a header file can cause vast numbers of files to be rebuilt.
I also note lots of other people extolling the virtues of pre-compiled headers (PCHs). They have their place and they can really help but they really shouldn't be used as an alternative to proper forward declaration. Otherwise modifications to header files can cause issues with recompilation of lots of files (as mentioned above) as well as triggering a PCH rebuild. PCHs can provide a big win for things like libraries that are pre-built but they are no reason not to use proper forward declarations.
Have a look in John Lakos's excellent book Large-Scale C++ Software Design. I think he has some figures for forward declaration, obtained by looking at what happens if you include N headers M levels deep.
If you don't use forward declarations, then aside from increasing the total build time from a clean source tree, it also vastly increases the incremental build time because header files are being included unnecessarily. Say you have 4 classes, A, B, C and D. C uses A and B in its implementation (i.e. in C.cpp) and D uses C in its implementation. The interface of D is forced to include C.h because of this 'no forward declaration' rule. Similarly C.h is forced to include A.h and B.h, so whenever A or B is changed, D.cpp has to be rebuilt even though it has no direct dependency. As the project scales up, this means that touching any header causes huge amounts of code to be rebuilt that just doesn't need to be.
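The A/B/C/D example can be checked mechanically. The sketch below models only include edges (a simplification of what a real build system tracks) and computes which files are affected when one header changes. The helper names affected_by, without_forward_declarations and with_forward_declarations are invented for this illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each file to the headers it includes directly.
using IncludeGraph = std::map<std::string, std::vector<std::string>>;

// Returns every file that includes `changed`, directly or through other headers.
std::set<std::string> affected_by(const IncludeGraph& includes, const std::string& changed) {
    assert(!changed.empty());
    std::set<std::string> affected;
    bool grew = true;
    while (grew) {  // repeat until no new file is added (fixed point)
        grew = false;
        for (const auto& entry : includes) {
            if (affected.count(entry.first)) continue;
            for (const auto& dep : entry.second) {
                if (dep == changed || affected.count(dep)) {
                    affected.insert(entry.first);
                    grew = true;
                    break;
                }
            }
        }
    }
    return affected;
}

// The 'no forward declaration' rule: headers must include what they mention.
IncludeGraph without_forward_declarations() {
    return {
        {"C.h", {"A.h", "B.h"}},
        {"C.cpp", {"C.h"}},
        {"D.h", {"C.h"}},
        {"D.cpp", {"D.h"}},
    };
}

// Headers forward declare; only the .cpp files include what they really use.
IncludeGraph with_forward_declarations() {
    return {
        {"C.h", {}},
        {"C.cpp", {"C.h", "A.h", "B.h"}},
        {"D.h", {}},
        {"D.cpp", {"D.h", "C.h"}},
    };
}
```

Under these assumed graphs, a change to A.h reaches C.h, C.cpp, D.h and D.cpp in the first layout but only C.cpp in the second.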
To have a rule that disallows forward declarations is (in my book) very bad practice indeed. It's going to waste huge amounts of developer time for no gain. The general rule of thumb should be that if the interface of class B depends on class A then B.h should include A.h; otherwise it should forward declare A. In practice 'depends on' means inherits from, holds as a by-value member variable, or uses any methods of. The Pimpl idiom is a widespread and well-understood method for hiding the implementation from the interface, and it allows you to vastly reduce the amount of rebuilding needed in your codebase.
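For anyone who hasn't met the Pimpl idiom, here is a minimal sketch using std::unique_ptr. The Widget class and its members are hypothetical, and the comments mark where the header/source split would fall in a real project.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- widget.h: all that clients ever see ---
class Widget {
public:
    explicit Widget(std::string label);
    ~Widget();                             // declared here, defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;
    const std::string& label() const;
    void click();
    int clicks() const;
private:
    struct Impl;                           // forward declaration only
    std::unique_ptr<Impl> impl_;
};

// --- widget.cpp: the implementation details live here ---
struct Widget::Impl {
    explicit Impl(std::string l) : label(std::move(l)) {}
    std::string label;
    int clicks = 0;
};

Widget::Widget(std::string label) : impl_(new Impl(std::move(label))) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::label() const { return impl_->label; }
int Widget::clicks() const { return impl_->clicks; }

void Widget::click() {
    assert(impl_ && "Widget was moved from");
    ++impl_->clicks;
}
```

Adding or changing a field in Impl touches only widget.cpp, so files that include widget.h are not rebuilt. The price is an extra heap allocation and one pointer hop per call.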
If you can't find the figures from Lakos then I would suggest creating your own experiments and taking timings to prove to your management that this rule is absolutely wrong-headed.