c++ linux clang clang++

How to get template instantiation statistics in Clang on Linux?


I'm exploring different aspects of templates in C++ and want to dive into this topic a bit more.

For instance, I have a very simple function template in some header file:

template<typename T>
void foo(T value)
{
    (void)value;
}

I want to get statistics on how much time is spent instantiating it while building some code. OK, maybe that kind of statistic is complicated to get. The overall number of template instantiations (10 times, 20 times, etc.) during the build would be enough for now.

For example, I have the following files: a.h, a.cpp, b.h, b.cpp, c.h, c.cpp, main.cpp, and CMakeLists.txt. I have completed the build and want to get the statistics.

I have tried analyzing the output of a Clang build with the -ftime-report option, but it's not what I want. It reports execution time and percentages for the different phases of compilation, and only for a single translation unit.

Is it possible to get these statistics?


Solution

  • If just the number of instantiations is enough (not the time spent on them), this should work:

    #include <cstddef>
    #include <iostream>
    #include <type_traits>
    
    inline int &GetCounter()
    {
        static int ret = 0;
        return ret;
    }
    
    template <typename T>
    static const auto register_type = []{
        GetCounter()++;
        return nullptr;
    }();
    
    template <typename T>
    void foo(T value)
    {
        (void)value;
    
        // Instantiate `register_type<T>`.
        (void)std::integral_constant<const std::nullptr_t *, &register_type<T>>{};
    }
    
    int main()
    {
        std::cout << GetCounter() << '\n'; // Prints 3: one instantiation each for int, double, float.
    
        if (false)
        {
            foo(1);
            foo(1.2);
            foo(1.2);
            foo(1.2f);
            foo(1.2f);
        }
    }
    

    Note the static on the register_type variable. It gives every TU its own copy of the variable, which allows us to count duplicate instantiations done by different TUs.

    I believe this also means that foo() technically violates the one definition rule (since in each TU it uses a different register_type variable), but this shouldn't cause any issues in practice, since when the linker deduplicates those instantiations, it doesn't matter which one gets picked, as all of them compile to the same assembly anyway.
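
If a per-type breakdown is more useful than one total, the same trick could probably be extended to key the counter by type. This is only a sketch: GetRegistry, UseFoo and PrintReport are names made up for illustration, and typeid(T).name() returns an implementation-defined string (a mangled name on Linux).

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <string>
#include <type_traits>
#include <typeinfo>

// Hypothetical registry: type name -> number of TUs that instantiated foo<T>.
// The function-local static is initialized on first use, so it is safe to
// touch from other static initializers.
inline std::map<std::string, int> &GetRegistry()
{
    static std::map<std::string, int> ret;
    return ret;
}

// Same idea as the counter version: one initializer run per instantiation,
// and `static` keeps a separate copy in every TU.
template <typename T>
static const auto register_type = []{
    GetRegistry()[typeid(T).name()]++;
    return nullptr;
}();

template <typename T>
void foo(T value)
{
    (void)value;

    // Instantiate `register_type<T>`.
    (void)std::integral_constant<const std::nullptr_t *, &register_type<T>>{};
}

// Hypothetical call site. It never has to run: compiling it is enough to
// instantiate foo<int>, foo<double> and foo<float>.
inline void UseFoo()
{
    foo(1);
    foo(1.2);
    foo(1.2);
    foo(1.2f);
}

// Prints one line per registered type.
inline void PrintReport()
{
    for (const auto &entry : GetRegistry())
        std::cout << entry.first << ": " << entry.second << '\n';
}
```

Calling PrintReport() from main should list three entries, one each for int, double and float. In a single-TU build each count is 1, and with several TUs a count should equal the number of TUs that instantiate foo for that type. The names come out mangled on Linux; piping the output through c++filt -t should make them readable.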


    If you do need the time spent on instantiations, I believe Clang's -ftime-trace is your best option. I'd run it for every TU, then combine the results somehow.
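
For the "combine the results somehow" part, here is a rough sketch of how the traces could be summed. It rests on assumptions rather than on a documented contract: that each trace is compact Chrome-trace-style JSON with the events in a top-level "traceEvents" array, that instantiation events have names starting with "Instantiate" (such as InstantiateFunction or InstantiateClass), and that "dur" is a duration in microseconds. FieldAfter and SumInstantiationTime are hypothetical helpers, and the string scanning is deliberately naive; a real JSON library would be more robust.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical helper: returns the text that follows `key` inside `event`,
// up to (not including) the first character listed in `stop`.
inline std::string FieldAfter(const std::string &event, const std::string &key, const char *stop)
{
    std::size_t pos = event.find(key);
    if (pos == std::string::npos)
        return "";
    pos += key.size();
    std::size_t end = event.find_first_of(stop, pos);
    if (end == std::string::npos)
        end = event.size();
    return event.substr(pos, end - pos);
}

// Sums "dur" per event name for every event whose name starts with
// "Instantiate". Assumes compact JSON (no spaces after ':') where each
// event is an object nested exactly two braces deep.
inline std::map<std::string, long long> SumInstantiationTime(const std::string &json)
{
    std::map<std::string, long long> totals;
    int depth = 0;
    bool in_string = false;
    std::size_t event_start = 0;

    for (std::size_t i = 0; i < json.size(); ++i)
    {
        const char c = json[i];
        if (in_string)
        {
            if (c == '\\')
                ++i; // Skip the escaped character.
            else if (c == '"')
                in_string = false;
            continue;
        }

        if (c == '"')
        {
            in_string = true;
        }
        else if (c == '{')
        {
            if (++depth == 2)
                event_start = i; // An event object begins here.
        }
        else if (c == '}')
        {
            if (depth-- == 2)
            {
                // An event object just closed: inspect its name and duration.
                const std::string event = json.substr(event_start, i - event_start + 1);
                const std::string name = FieldAfter(event, "\"name\":\"", "\"");
                if (name.rfind("Instantiate", 0) == 0)
                    totals[name] += std::atoll(FieldAfter(event, "\"dur\":", ",}").c_str());
            }
        }
    }
    return totals;
}
```

To cover a whole build, one would read each TU's trace file into a string, call SumInstantiationTime on it, and add the resulting maps together. Keep in mind that instantiation events can nest (instantiating one template may trigger others), so a plain sum like this can count the same time more than once.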