Search code examples
c++metaprogrammingboost-hana

Compile time efficient remove duplicates from a boost::hana tuple


I use the boost::hana to_map function to remove duplicates from boost::hana tuple of types. See it at the compiler explorer. The code works very well but compiles very long (~10s). I wonder if there exist a faster solution that is compatible with boost::hana tuple.

#include <boost/hana/map.hpp>
#include <boost/hana/pair.hpp>
#include <boost/hana/type.hpp>
#include <boost/hana/basic_tuple.hpp>
#include <boost/hana/size.hpp>

using namespace boost::hana;

constexpr auto to_type_pair = [](auto x) { return make_pair(typeid_(x), x); };

template <class Tuple>
constexpr auto remove_duplicate_types(Tuple tuple)
{
    return values(to_map(transform(tuple, to_type_pair)));
}


int main(){

    auto tuple = make_basic_tuple(
        0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
        , 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
        , 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
        , 31, 32, 33, 34, 35, 36, 37, 38, 39, 40
        , 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
        , 51, 52, 53, 54, 55, 56, 57, 58, 59, 60
        , 61, 62, 63, 64, 65, 66, 67, 68, 69, 70
        );

    auto noDuplicatesTuple = remove_duplicate_types(tuple);

    // Should return 1 since there is only one distinct type in the tuple
    return size(noDuplicatesTuple);
}

Solution

  • I haven't run any benchmarks, but your example does not appear to take 10 seconds on Compiler Explorer. However, I can explain why it is a relatively slow solution, and suggest an alternative that assumes you are only interested getting a unique list of types and not retaining any run-time information in your result.

    Creating large tuples and/or instantiating function templates that have large tuples in their prototypes are expensive compile-time operations.

    Just your call to transform instantiates a lambda for each element which in turn instantiates pair. The input/output of this call are both large tuples.

    The call to to_map makes an empty map and recursively calls insert for each element each time making a new map, but in this simple case the intermediate result will always be hana::map<int>. I'm willing to bet that this is exploding your compile-times if your actual use case is non-trivial. (It was certainly an issue when we were implementing hana::map so we made hana::make_map avoid this since it has all of its inputs up front).

    All of this, and there is a significant penalty for these large function types being used in run-time code. You might notice a difference if you wrapped the operations in decltype and only used the resulting type.

    Alternatively, using raw template metaprogramming can sometimes yield performance results over function template based metaprogramming. Here is an example for your use case:

    #include <boost/hana/basic_tuple.hpp>
    #include <boost/mp11/algorithm.hpp>
    
    namespace hana = boost::hana;
    using namespace boost::mp11;
    
    
    int main() {
        auto tuple = hana::make_basic_tuple(
            0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
            , 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
            , 21, 22, 23, 24, 25, 26, 27, 28, 29, 30
            , 31, 32, 33, 34, 35, 36, 37, 38, 39, 40
            , 41, 42, 43, 44, 45, 46, 47, 48, 49, 50
            , 51, 52, 53, 54, 55, 56, 57, 58, 59, 60
            , 61, 62, 63, 64, 65, 66, 67, 68, 69, 70
            );
    
        hana::basic_tuple<int> no_dups = mp_unique<std::decay_t<decltype(tuple)>>{};
    }
    

    https://godbolt.org/z/EnTWf6