Explicit template instantiation - when is it used?

Use it when you define a template class that you only want to work with a couple of explicit types.

Put the template declaration in the header file just like a normal class.

Put the template definition in a source file just like a normal class.

Then, at the end of the source file, explicitly instantiate only the version you want to be available.

Silly example:

// StringAdapter.h
#include <string>

template<typename T>
class StringAdapter
{
     public:
         StringAdapter(const T* data);
         void doAdapterStuff();
     private:
         std::basic_string<T> m_data;
};
typedef StringAdapter<char>    StrAdapter;
typedef StringAdapter<wchar_t> WStrAdapter;

Source:

// StringAdapter.cpp
#include "StringAdapter.h"

// Takes const T* so that string literals such as "hi There" can be passed.
template<typename T>
StringAdapter<T>::StringAdapter(const T* data)
    :m_data(data)
{}

template<typename T>
void StringAdapter<T>::doAdapterStuff()
{
    /* Manipulate a string */
}

// Explicitly instantiate only the classes you want to be defined.
// In this case I only want the template to work with characters but
// I want to support both char and wchar_t with the same code.
template class StringAdapter<char>;
template class StringAdapter<wchar_t>;

Main

#include "StringAdapter.h"

// Note: Main cannot see the definition of the template from here (just the declaration)
//       So it relies on the explicit instantiation to make sure it links.
int main()
{
  StrAdapter  x("hi There");
  x.doAdapterStuff();
}

Directly copied from https://docs.microsoft.com/en-us/cpp/cpp/explicit-instantiation:

You can use explicit instantiation to create an instantiation of a templated class or function without actually using it in your code. Because this is useful when you are creating library (.lib) files that use templates for distribution, uninstantiated template definitions are not put into object (.obj) files.

(For instance, libstdc++ contains the explicit instantiation of std::basic_string<char,char_traits<char>,allocator<char> > (which is std::string), so every time you use functions of std::string, the same function code doesn't need to be copied into your object files. The compiler only needs to refer (link) those to libstdc++.)
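
The quoted passage mentions functions as well as classes. A minimal sketch of explicit instantiation for a function template follows. The name twice is hypothetical, and in a real library the definition and the two instantiation lines would sit in a .cpp file with only the declaration in the header. Everything is collapsed into one file here so that it compiles on its own.

```cpp
// Hypothetical function template, used only for illustration.
template <typename T>
T twice(T value) {
    return value + value;
}

// Explicit instantiation definitions: the compiler emits code for exactly
// these two specializations in this translation unit, even if nothing in
// this file calls them.
template int twice<int>(int);
template double twice<double>(double);
```

Callers that only see the declaration can then link against twice<int> and twice<double>, while any other type would fail at link time, as with the StringAdapter example above.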


Explicit instantiation allows reducing compile times and output sizes

These are the major gains it can provide. They come from the following two effects described in detail in the sections below:

  • remove definitions from headers to prevent intelligent build systems from rebuilding includers on every change to those templates (saves time)
  • prevent object redefinition (saves time and size)

Remove definitions from headers

Explicit instantiation allows you to leave definitions in the .cpp file.

When the definition is in the header and you modify it, an intelligent build system would recompile all includers, which could be dozens of files, possibly making incremental re-compilation after a single file change unbearably slow.

Putting definitions in .cpp files does have the downside that external libraries can't reuse the template with their own new classes, but "Remove definitions from included headers but also expose templates as an external API" below shows a workaround.

See concrete examples below.

Examples of build systems that detect includes and rebuild:

  • CMake: Handling header files dependencies with cmake
  • SCons: https://scons.org/doc/0.97/HTML/scons-man.html
  • Make + some manual GCC work: generate dependencies for a makefile for a project in C/C++

Object redefinition gains: understanding the problem

If you just completely define a template in a header file, every single compilation unit that includes that header ends up compiling its own implicit copy of the template for every distinct set of template arguments it uses.

This means a lot of useless disk usage and compilation time.

Here is a concrete example, in which both main.cpp and notmain.cpp implicitly define MyTemplate<int> due to its usage in those files.

main.cpp

#include <iostream>

#include "mytemplate.hpp"
#include "notmain.hpp"

int main() {
    std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
}

notmain.cpp

#include "mytemplate.hpp"
#include "notmain.hpp"

int notmain() { return MyTemplate<int>().f(1); }

mytemplate.hpp

#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP

template<class T>
struct MyTemplate {
    T f(T t) { return t + 1; }
};

#endif

notmain.hpp

#ifndef NOTMAIN_HPP
#define NOTMAIN_HPP

int notmain();

#endif

GitHub upstream.

Compile and view symbols with nm:

g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o notmain.o notmain.cpp
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o main.o main.cpp
g++    -Wall -Wextra -std=c++11 -pedantic-errors -o main.out notmain.o main.o
echo notmain.o
nm -C -S notmain.o | grep MyTemplate
echo main.o
nm -C -S main.o | grep MyTemplate

Output:

notmain.o
0000000000000000 0000000000000017 W MyTemplate<int>::f(int)
main.o
0000000000000000 0000000000000017 W MyTemplate<int>::f(int)

So we see that a separate section is generated for every single method instantiation, and that each of them of course takes up space in the object files.

From man nm, we see that W means weak symbol, which GCC chose because this is a template function.

The reason it doesn't blow up at link time with multiple definitions is that the linker accepts multiple weak definitions, and just picks one of them to put in the final executable, and all of them are the same in our case, so all is fine.

The numbers in the output mean:

  • 0000000000000000: address within section. This zero is because templates are automatically put into their own section
  • 0000000000000017: size of the code generated for them, in hexadecimal (0x17 = 23 bytes)

We can see this a bit more clearly with:

objdump -S main.o | c++filt

which ends in:

Disassembly of section .text._ZN10MyTemplateIiE1fEi:

0000000000000000 <MyTemplate<int>::f(int)>:
   0:   f3 0f 1e fa             endbr64 
   4:   55                      push   %rbp
   5:   48 89 e5                mov    %rsp,%rbp
   8:   48 89 7d f8             mov    %rdi,-0x8(%rbp)
   c:   89 75 f4                mov    %esi,-0xc(%rbp)
   f:   8b 45 f4                mov    -0xc(%rbp),%eax
  12:   83 c0 01                add    $0x1,%eax
  15:   5d                      pop    %rbp
  16:   c3                      retq

and _ZN10MyTemplateIiE1fEi is the mangled name of MyTemplate<int>::f(int), which c++filt decided not to unmangle in the section name.
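
If you want to check such a mangled name from code rather than with c++filt, here is a minimal sketch using abi::__cxa_demangle from <cxxabi.h>. This is specific to GCC and Clang (Itanium C++ ABI), and the helper name demangle is an assumption made for this example.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI only)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of a symbol name,
// or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, the result is malloc-allocated and we own it.
    char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || result == nullptr) {
        return mangled;
    }
    std::string demangled(result);
    std::free(result);
    return demangled;
}
```

Calling demangle("_ZN10MyTemplateIiE1fEi") gives MyTemplate<int>::f(int), matching the symbol that nm -C printed.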

Solutions to the object redefinition problem

This problem can be avoided by using explicit instantiation and either:

  1. keep the definition in the hpp and add extern template to the hpp for the types which are going to be explicitly instantiated.

    As explained at "using extern template (C++11)", extern template prevents a completely defined template from being instantiated by compilation units, except for our explicit instantiation. This way, only our explicit instantiation will be defined in the final objects:

    mytemplate.hpp

    #ifndef MYTEMPLATE_HPP
    #define MYTEMPLATE_HPP
    
    template<class T>
    struct MyTemplate {
        T f(T t) { return t + 1; }
    };
    
    extern template class MyTemplate<int>;
    
    #endif
    

    mytemplate.cpp

    #include "mytemplate.hpp"
    
    // Explicit instantiation required just for int.
    template class MyTemplate<int>;
    

    main.cpp

    #include <iostream>
    
    #include "mytemplate.hpp"
    #include "notmain.hpp"
    
    int main() {
        std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
    }
    

    notmain.cpp

    #include "mytemplate.hpp"
    #include "notmain.hpp"
    
    int notmain() { return MyTemplate<int>().f(1); }
    

    Downsides:

    • the definition stays in the header, making recompiles after a single change to that header possibly slow
    • if you are a header-only library, you force external projects to do their own explicit instantiation. If you are not a header-only library, this solution is likely the best.
    • if the template type is defined in your own project and is not a built-in like int, it seems that you are forced to add the include for it in the header, and a forward declaration is not enough (see: extern template & incomplete types). This increases header dependencies a bit.
  2. move the definition to the cpp file and leave only the declaration in the hpp, i.e. modify the original example to be:

    mytemplate.hpp

    #ifndef MYTEMPLATE_HPP
    #define MYTEMPLATE_HPP
    
    template<class T>
    struct MyTemplate {
        T f(T t);
    };
    
    #endif
    

    mytemplate.cpp

    #include "mytemplate.hpp"
    
    template<class T>
    T MyTemplate<T>::f(T t) { return t + 1; }
    
    // Explicit instantiation.
    template class MyTemplate<int>;
    

    Downside: external projects can't use your template with their own types. Also you are forced to explicitly instantiate all types. But maybe this is an upside since then programmers won't forget.

  3. keep definition on hpp and add extern template on every includer:

    mytemplate.cpp

    #include "mytemplate.hpp"
    
    // Explicit instantiation.
    template class MyTemplate<int>;
    

    main.cpp

    #include <iostream>
    
    #include "mytemplate.hpp"
    #include "notmain.hpp"
    
    // extern template declaration
    extern template class MyTemplate<int>;
    
    int main() {
        std::cout << notmain() + MyTemplate<int>().f(1) << std::endl;
    }
    

    notmain.cpp

    #include "mytemplate.hpp"
    #include "notmain.hpp"
    
    // extern template declaration
    extern template class MyTemplate<int>;
    
    int notmain() { return MyTemplate<int>().f(1); }
    

    Downside: all includers have to add the extern to their CPP files, which programmers will likely forget to do.

With any of those solutions, nm now contains:

notmain.o
                 U MyTemplate<int>::f(int)
main.o
                 U MyTemplate<int>::f(int)
mytemplate.o
0000000000000000 W MyTemplate<int>::f(int)

so we see that only mytemplate.o has a compilation of MyTemplate<int> as desired, while notmain.o and main.o don't because U means undefined.

Remove definitions from included headers but also expose templates as an external API in a header-only library

If your library is not header only, the extern template method will work, since using projects will just link to your object file, which will contain the object of the explicit template instantiation.

However, for header only libraries, if you want to both:

  • speed up your project's compilation
  • expose headers as an external library API for others to use it

then you can try one of the following:

  1. an interface header that holds only declarations:

    • mytemplate.hpp: template definition
    • mytemplate_interface.hpp: template declaration only, matching the definitions from mytemplate.hpp, no definitions
    • mytemplate.cpp: include mytemplate.hpp and make explicit instantiations
    • main.cpp and everywhere else in the code base: include mytemplate_interface.hpp, not mytemplate.hpp
  2. a wrapper header that adds extern template declarations:

    • mytemplate.hpp: template definition
    • mytemplate_implementation.hpp: includes mytemplate.hpp and adds extern to every class that will be instantiated
    • mytemplate.cpp: include mytemplate.hpp and make explicit instantiations
    • main.cpp and everywhere else in the code base: include mytemplate_implementation.hpp, not mytemplate.hpp

Or even better perhaps for multiple headers: create an intf/impl folder inside your includes/ folder and use mytemplate.hpp as the name always.

The mytemplate_interface.hpp approach looks like this:

mytemplate.hpp

#ifndef MYTEMPLATE_HPP
#define MYTEMPLATE_HPP

#include "mytemplate_interface.hpp"

template<class T>
T MyTemplate<T>::f(T t) { return t + 1; }

#endif

mytemplate_interface.hpp

#ifndef MYTEMPLATE_INTERFACE_HPP
#define MYTEMPLATE_INTERFACE_HPP

template<class T>
struct MyTemplate {
    T f(T t);
};

#endif

mytemplate.cpp

#include "mytemplate.hpp"

// Explicit instantiation.
template class MyTemplate<int>;

main.cpp

#include <iostream>

#include "mytemplate_interface.hpp"

int main() {
    std::cout << MyTemplate<int>().f(1) << std::endl;
}

Compile and run:

g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o mytemplate.o mytemplate.cpp
g++ -c -Wall -Wextra -std=c++11 -pedantic-errors -o main.o main.cpp
g++    -Wall -Wextra -std=c++11 -pedantic-errors -o main.out main.o mytemplate.o

Output:

2

Tested in Ubuntu 18.04.
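
The mytemplate_implementation.hpp variant listed earlier is not spelled out in code. Here is a minimal sketch of how it might look, with the files collapsed into a single translation unit so that it compiles on its own. The file boundaries are marked with comments, and the function use_template is a hypothetical stand-in for any includer.

```cpp
// --- mytemplate.hpp: full template definition ---
template <class T>
struct MyTemplate {
    T f(T t) { return t + 1; }
};

// --- mytemplate_implementation.hpp: would #include "mytemplate.hpp" and
// then declare every instantiation that mytemplate.cpp provides ---
extern template struct MyTemplate<int>;

// --- mytemplate.cpp: would #include "mytemplate.hpp" and instantiate ---
template struct MyTemplate<int>;

// --- any other .cpp: would #include "mytemplate_implementation.hpp" ---
// (use_template is a hypothetical caller)
int use_template() {
    return MyTemplate<int>().f(1);
}
```

External users who need MyTemplate with their own types can still include mytemplate.hpp directly, since the definition remains visible there.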

C++20 modules

https://en.cppreference.com/w/cpp/language/modules

I think this feature will provide the best setup going forward as it becomes available, but I haven't checked it yet because it is not yet available on my GCC 9.2.1.

You will still have to do explicit instantiation to get the speedup/disk saving, but at least we will have a sane solution for "Remove definitions from included headers but also expose templates as an external API", which does not require copying things around 100 times.

Expected usage (without the explicit instantiation, since I am not sure what the exact syntax will be like, see: How to use template explicit instantiation with C++20 modules?) would be something along the lines of:

helloworld.cpp

export module helloworld;  // module declaration
import <iostream>;         // import declaration
 
export template<class T>      // export declaration: "export" precedes "template"
void hello(T t) {
    std::cout << t << std::endl;
}

main.cpp

import helloworld;  // import declaration
 
int main() {
    hello(1);
    hello("world");
}

and then compilation mentioned at https://quuxplusone.github.io/blog/2019/11/07/modular-hello-world/

clang++ -std=c++2a -c helloworld.cpp -Xclang -emit-module-interface -o helloworld.pcm
clang++ -std=c++2a -c -o helloworld.o helloworld.cpp
clang++ -std=c++2a -fprebuilt-module-path=. -o main.out main.cpp helloworld.o

So from this we see that clang can extract the template interface + implementation into the magic helloworld.pcm, which must contain some LLVM intermediate representation of the source (see: How are templates handled in C++ module system?), and which still allows template instantiation to happen in the importing file.

How to quickly analyze your build to see if it would gain a lot from explicit template instantiation

So, you've got a complex project and you want to decide if explicit template instantiation will bring significant gains without actually doing the full refactor?

The analysis below might help you decide, or at least select the most promising objects to refactor first while you experiment, by borrowing some ideas from: My C++ object file is too big

# List all weak symbols with size only, no address.
find . -name '*.o' | xargs -I{} nm -C --size-sort --radix d '{}' |
  grep ' W ' > nm.log

# Sort by symbol size.
sort -k1 -n nm.log -o nm.sort.log

# Get a repetition count.
uniq -c nm.sort.log > nm.uniq.log

# Find the most repeated/largest objects.
sort -k1,2 -n nm.uniq.log -o nm.uniq.sort.log

# Find the objects that would give you the most gain after refactor.
# This gain is calculated as "(n_occurrences - 1) * size" which is
# the size you would gain for keeping just a single instance.
# If you are going to refactor anything, you should start with the ones
# at the bottom of this list. 
awk '{gain = ($1 - 1) * $2; print gain, $0}' nm.uniq.sort.log |
  sort -k1 -n > nm.gains.log

# Total gain if you refactored everything.
awk 'BEGIN{sum=0}{sum += $1}END{print sum}' nm.gains.log

# Total size. The closer total gain above is to total size, the more
# you would gain from the refactor.
awk 'BEGIN{sum=0}{sum += $1}END{print sum}' nm.log
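
To make the "(n_occurrences - 1) * size" formula concrete, here is a small sketch of the same computation in C++. The WeakSymbol struct and the total_gain function are hypothetical names, and the input is assumed to be the already parsed (size, name) pairs from nm.log.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One parsed line of nm.log: symbol size in bytes plus demangled name.
struct WeakSymbol {
    std::uint64_t size;
    std::string name;
};

// Sums (occurrences - 1) * size over every distinct (size, name) pair,
// i.e. the bytes saved if only one copy of each weak symbol were kept.
std::uint64_t total_gain(const std::vector<WeakSymbol>& symbols) {
    std::map<std::pair<std::uint64_t, std::string>, std::uint64_t> occurrences;
    for (const WeakSymbol& symbol : symbols) {
        ++occurrences[std::make_pair(symbol.size, symbol.name)];
    }
    std::uint64_t gain = 0;
    for (const auto& entry : occurrences) {
        gain += (entry.second - 1) * entry.first.first;
    }
    return gain;
}
```

For the two-object example from earlier, where MyTemplate<int>::f(int) appears twice with size 23, the gain is 23 bytes: one of the two copies is redundant.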

The dream: a template compiler cache

I think the ultimate solution would be if we could build with:

g++ --template-cache myfile.o file1.cpp
g++ --template-cache myfile.o file2.cpp

and then myfile.o would automatically reuse previously compiled templates across files.

This would mean 0 extra effort on the programmers besides passing that extra CLI option to your build system.

A secondary bonus of explicit template instantiation: help IDEs list template instantiations

I've found that some IDEs such as Eclipse cannot resolve "a list of all template instantiations used".

So e.g., if you are inside templated code and you want to find the possible values of the template parameter, you would have to find the constructor usages one by one and deduce the possible types one by one.

But on Eclipse 2020-03 I can easily list explicitly instantiated templates by doing a Find all usages (Ctrl + Alt + G) search on the class name, which points me e.g. from:

template <class T>
struct AnimalTemplate {
    T animal;
    AnimalTemplate(T animal) : animal(animal) {}
    std::string noise() {
        return animal.noise();
    }
};

to:

template class AnimalTemplate<Dog>;

Here's a demo: https://github.com/cirosantilli/ide-test-projects/blob/e1c7c6634f2d5cdeafd2bdc79bcfbb2057cb04c4/cpp/animal_template.hpp#L15

Another guerrilla technique you could use outside of the IDE, however, would be to run nm -C on the final executable and grep the template name:

nm -C main.out | grep AnimalTemplate

which directly points to the fact that Dog was one of the instantiations:

0000000000004dac W AnimalTemplate<Dog>::noise[abi:cxx11]()
0000000000004d82 W AnimalTemplate<Dog>::AnimalTemplate(Dog)
0000000000004d82 W AnimalTemplate<Dog>::AnimalTemplate(Dog)
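
For reference, here is a self-contained sketch of code that would produce symbols like the ones above. The Dog struct and its "woof" return value are assumptions made for this example, and the linked demo may differ in its details.

```cpp
#include <string>

// Hypothetical animal type; only the noise() member is required by the template.
struct Dog {
    std::string noise() { return "woof"; }
};

template <class T>
struct AnimalTemplate {
    T animal;
    AnimalTemplate(T animal) : animal(animal) {}
    std::string noise() {
        return animal.noise();
    }
};

// The explicit instantiation that an IDE usage search (or nm) can find.
template struct AnimalTemplate<Dog>;
```

Because the explicit instantiation names Dog directly, a plain text search for AnimalTemplate< also finds it, with no need to trace constructor calls.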
