How to reduce compile time with C++ templates

I think the general rules apply. Try to reduce coupling between parts of the code. Break up overly large template headers into smaller groups of functions that are used together, so the whole thing doesn't have to be included in each and every source file.

Also, try to get the headers into a stable state fast, perhaps testing them out against a smaller test program, so they wouldn't need changing (too much) when integrated into a larger program.

(As with any optimization, tuning templates for the compiler's speed may be less worthwhile than finding an "algorithmic" optimization that drastically reduces the workload in the first place.)


Several approaches:

  • The export keyword could theoretically help, but it was poorly supported and was officially removed in C++11.
  • Explicit template instantiation is the most straightforward approach, if you can predict ahead of time which instantiations you'll need (and if you don't mind maintaining this list).
  • Extern templates (explicit instantiation declarations), which are standard since C++11 and were supported by several compilers as extensions before that. It's my understanding that extern templates don't necessarily let you move the template definitions out of the header file, but they do make compiling and linking faster (by reducing the number of times that template code must be instantiated and linked).
  • Depending on your template design, you may be able to move most of its complexity into a .cpp file. The standard example is a type-safe vector template class that merely wraps a type-unsafe vector of void*; all of the complexity goes in the void* vector that resides in a .cpp file. Scott Meyers gives a more detailed example in Effective C++ (item 42, "Use private inheritance judiciously", in the 2nd edition).
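A minimal sketch of the explicit instantiation and extern template bullets above, collapsed into one file for readability. RunningSum and the file names in the comments are hypothetical, chosen only for illustration; in a real project the two marked parts would live in separate files.

```cpp
// --- would live in a header, e.g. "running_sum.hpp" (hypothetical name) ---
template <class T>
class RunningSum
{
public:
  void add(T value);
  T total() const;

private:
  T total_ = T();
};

template <class T>
void RunningSum<T>::add(T value) { total_ += value; }

template <class T>
T RunningSum<T>::total() const { return total_; }

// Explicit instantiation declarations (C++11 "extern template"):
// translation units that include the header will not implicitly
// instantiate these two specializations themselves.
extern template class RunningSum<int>;
extern template class RunningSum<double>;

// --- would live in exactly one source file, e.g. "running_sum.cpp" ---
// Explicit instantiation definitions: the code is generated once, here.
template class RunningSum<int>;
template class RunningSum<double>;
```

Any other specialization, say RunningSum<long>, is still instantiated implicitly wherever it is used, which is why the list of instantiations has to be maintained by hand.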

First of all, for completeness, I'll cover the straightforward solution: only use templated code when necessary, and base it on non-template code (with implementation in its own source file).
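As a rough sketch of basing a template on non-template code: the type-independent part (here, the bounds check and its error message) goes into an ordinary function whose definition could live in its own source file. CheckedArray and detail::check_index are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// --- definition would live in a .cpp file; the header only needs the declaration ---
namespace detail {
void check_index(std::size_t index, std::size_t size)
{
  if (index >= size) {
    throw std::out_of_range("index " + std::to_string(index) +
                            " out of range for size " + std::to_string(size));
  }
}
} // namespace detail

// --- would live in the header: only the thin, type-dependent part is a template ---
template <class T>
class CheckedArray
{
public:
  explicit CheckedArray(std::size_t n) : data_(n) {}

  std::size_t size() const { return data_.size(); }

  T& at(std::size_t index)
  {
    detail::check_index(index, data_.size());  // non-template helper
    return data_[index];
  }

  T const& at(std::size_t index) const
  {
    detail::check_index(index, data_.size());
    return data_[index];
  }

private:
  std::vector<T> data_;
};
```

With this split, changing the error message only recompiles the one source file that defines check_index, and the header no longer needs <string>.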

However, I suspect that the real issue is that you use generic programming as you would use typical OO-programming and end up with a bloated class.

Let's take an example:

// "bigArray/bigArray.hpp"

#include <cstddef>

template <class T, class Allocator>
class BigArray
{
public:
  std::size_t size() const;

  T& operator[](std::size_t index);
  T const& operator[](std::size_t index) const;

  T& at(std::size_t index);
  T const& at(std::size_t index) const;

private:
  // impl
};

Does this shock you? Probably not. It seems pretty minimalist after all. The thing is, it's not. The at methods can be factored out into free functions without any loss of generality:

// "bigArray/at.hpp"

#include <stdexcept>

// Bounds-checked access for any container that provides size(), operator[],
// and the usual reference / const_reference / size_type typedefs.
template <class Container>
typename Container::reference at(Container& container,
                                 typename Container::size_type index)
{
  if (index >= container.size()) throw std::out_of_range("at: index out of range");
  return container[index];
}

template <class Container>
typename Container::const_reference at(Container const& container,
                                       typename Container::size_type index)
{
  if (index >= container.size()) throw std::out_of_range("at: index out of range");
  return container[index];
}

Okay, this changes the invocation slightly:

// From
myArray.at(i).method();

// To
at(myArray,i).method();

However, thanks to Koenig lookup (argument-dependent lookup), you can call them unqualified as long as you put them in the same namespace as the container, so it's just a matter of habit.

The example is contrived but the general point stands. Note that because it is generic, at.hpp never has to include bigArray.hpp, and it will still produce code as tight as a member function would. The only difference is that we can also invoke it on other containers if we wish.
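To make the lookup concrete, here is a small self-contained sketch. SmallArray and the demo namespace are hypothetical stand-ins for BigArray and its namespace, and the container is assumed to expose the standard reference, const_reference and size_type typedefs.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace demo {

// Hypothetical fixed-size stand-in for BigArray.
class SmallArray
{
public:
  typedef int& reference;
  typedef int const& const_reference;
  typedef std::size_t size_type;

  size_type size() const { return 3; }
  reference operator[](size_type i) { return data_[i]; }
  const_reference operator[](size_type i) const { return data_[i]; }

private:
  int data_[3] = {10, 20, 30};
};

// Free function: it only relies on the container's public interface.
template <class Container>
typename Container::reference at(Container& container,
                                 typename Container::size_type index)
{
  if (index >= container.size()) throw std::out_of_range("at: index out of range");
  return container[index];
}

template <class Container>
typename Container::const_reference at(Container const& container,
                                       typename Container::size_type index)
{
  if (index >= container.size()) throw std::out_of_range("at: index out of range");
  return container[index];
}

} // namespace demo
```

From outside the namespace, at(myArray, 1) with a demo::SmallArray finds demo::at through argument-dependent lookup. For a std::vector<int> the associated namespace is std, so the call has to be written demo::at(v, 1) (or preceded by a using-declaration).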

And now, a user of BigArray does not need to include at.hpp if she does not use it, which reduces her dependencies. She is also not affected if you change the code in that file, for example to make the std::out_of_range message include the file name and line number, the address of the container, its size and the index we tried to access.

The other (not so obvious) advantage is that if an invariant of BigArray is ever violated, at is obviously not to blame, since it cannot touch the internals of the class. That reduces the number of suspects.

This is recommended by many authors, such as Herb Sutter and Andrei Alexandrescu in C++ Coding Standards:

Item 44: Prefer writing nonmember nonfriend functions

and has been extensively used in Boost... But you do have to change your coding habits!

Then, of course, you should include only what you actually depend on. Static analysis tools that report included but unused header files (include-what-you-use is one example) can help you figure this out.


  • You can use explicit instantiation; however, only the types you explicitly instantiate are compiled ahead of time. Any other type still needs the template definition to be visible at the point of use.

  • You might be able to take advantage of C++20 modules.

  • If part of your algorithm does not depend on the template parameters, you can factor it out into non-template code and put that in its own .cc file.

  • I wouldn't suggest this unless it's a major problem but: you may be able to provide a template container interface that is implemented with calls to a void* implementation that you are free to change at will.

  • Before C++11 you could use a compiler that supports the export keyword.
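A minimal sketch of the void* approach from the list above, assuming a non-owning container of pointers. PtrVectorBase and PtrVector are hypothetical names; the base class member definitions are the part that could move into a .cc file, while the template stays a thin layer of casts.

```cpp
#include <cstddef>
#include <vector>

// --- non-template core: class declaration in a header, definitions in a .cc file ---
class PtrVectorBase
{
protected:
  void push_back_raw(void* p);
  void* get_raw(std::size_t index) const;
  std::size_t size_raw() const;

private:
  std::vector<void*> items_;
};

// These definitions are what would move into the .cc file.
void PtrVectorBase::push_back_raw(void* p) { items_.push_back(p); }
void* PtrVectorBase::get_raw(std::size_t index) const { return items_[index]; }
std::size_t PtrVectorBase::size_raw() const { return items_.size(); }

// --- thin type-safe wrapper: the only part instantiated per type ---
// Private inheritance keeps the untyped interface out of reach of users.
// Assumes T is not const-qualified, since T* must convert to void*.
template <class T>
class PtrVector : private PtrVectorBase
{
public:
  void push_back(T* p) { push_back_raw(p); }
  T* operator[](std::size_t index) const { return static_cast<T*>(get_raw(index)); }
  std::size_t size() const { return size_raw(); }
};
```

Each PtrVector<T> instantiation is only a handful of trivial forwarding functions, so adding more element types costs the compiler very little.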