Tags: c++, variant, std, visitor-pattern

Which implementations are conformant for constant time dispatch of std::visit?


It is often claimed that C++'s std::variant has worse performance than variants in other languages. See e.g. https://pbackus.github.io/blog/beating-stdvisit-without-really-trying.html

One of the reasons is the valueless-by-exception state; the other is said to be the standard's requirement that the complexity of std::visit not depend on the number of types, which forces implementers to use a dispatch table of function pointers and inhibits certain optimizations such as inlining.

Questions:

  1. Would it be legal to implement std::visit as an if-else if chain where each check looks like if (v.index() == I) return visitor(get<I>(v));? It seems like it would not be, because dispatching to higher indices takes more index checks.
  2. Is it legal to implement it with a switch like switch (v.index()) { case I: return visitor(get<I>(v)); }? This looks "independent of the number of types", as control flow jumps directly to the matching case, and it is how e.g. Microsoft implemented it for fewer than 64 (or so) types.
  3. If the switch is legal, is it legal for the optimizer to convert small switches into if-else if ladders? Again, the MSVC compiler does this: https://godbolt.org/z/D2Q5ED. But IMO this violates the "independent of the number of types" complexity mandated by the standard, although in practice it would be faster.
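
To make question 2 concrete, a hand-rolled switch dispatch over a small variant might look like the following sketch (Var, visit3 and the chosen alternatives are made-up names for illustration, not any library's actual code):

```cpp
#include <string>
#include <variant>

// Sketch of the switch-based dispatch from question 2 over a fixed
// three-alternative variant. Var and visit3 are illustrative names.
using Var = std::variant<int, double, std::string>;

template <class Visitor>
auto visit3(Visitor&& vis, Var const& v) {
    switch (v.index()) {
    case 0: return vis(std::get<0>(v)); // control flow jumps straight to the case
    case 1: return vis(std::get<1>(v));
    case 2: return vis(std::get<2>(v));
    default: throw std::bad_variant_access{}; // valueless_by_exception()
    }
}
```

Whether the compiler emits a jump table or a comparison chain for this switch is exactly what question 3 is about.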

I'm asking based on a discussion on reddit where someone claimed that "constant time" again refers to the behavior in the limit, which I don't really agree with. IMO "constant time" means the time is constant no matter the input, which is not the case for an if-else if chain.

But I'm not sure how this applies here. It seems that 1. is illegal and 2. is legal, but the optimizer may transform 2. into 1., which would make 2. illegal as well, at least when following the standard to the letter.


Solution

  • You are confusing time complexity with running time. See for example here.

    This being said, it's important to realize what statements like "constant time" and "does not depend on" mean when talking about complexity: In this context, both of them are just synonyms for O(1). So, the claim in the reddit comment is indeed correct.

    In the case of visit, this means that there is some constant C such that visit never takes more than C CPU cycles, even for an insane number of types in the variant. In particular, it is of course allowed to stay below this number C.

    So, to actually answer your question: Using an if-else if chain exclusively will usually have linear (i.e. O(n)) complexity, as you correctly state. So, unless the implementer knows for a fact that the compiler will optimize it into something constant, this is illegal.

    A switch will very likely be compiled into a jump table, which works in O(1). This is especially likely if the cases form a contiguous range of integers, as they do here. So an implementation using switch should be legal. Note, however, that for a difficult distribution of cases a switch may be compiled into a binary search, which is O(log n). See e.g. here.
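
    For comparison, the dispatch-table strategy mentioned in the question can be sketched as follows: one function pointer per alternative, indexed by v.index(). The names (table_visit etc.) are invented for illustration, and the visitor's return type is fixed to std::size_t for brevity; real implementations are more general and deduce it.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <variant>

// Illustrative O(1) dispatch via a table of function pointers, one entry
// per alternative. A single indexed lookup replaces any chain of checks,
// regardless of how many alternatives the variant has.
using Var = std::variant<int, double, std::string>;

template <class Visitor, std::size_t I>
std::size_t call_alternative(Visitor& vis, Var const& v) {
    return vis(std::get<I>(v));
}

template <class Visitor, std::size_t... Is>
std::size_t table_visit_impl(Visitor& vis, Var const& v,
                             std::index_sequence<Is...>) {
    using Fn = std::size_t (*)(Visitor&, Var const&);
    // One function pointer per alternative, built at compile time.
    static constexpr Fn table[] = { &call_alternative<Visitor, Is>... };
    return table[v.index()](vis, v); // single indexed lookup: O(1)
}

template <class Visitor>
std::size_t table_visit(Visitor vis, Var const& v) {
    return table_visit_impl(vis, v,
        std::make_index_sequence<std::variant_size_v<Var>>{});
}
```

    The price of this genericity is the indirect call through the table, which is precisely what inhibits inlining, as noted at the top of the question.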

    And lastly, given what I said earlier, it is of course allowed to apply any sort of optimization, as long as this doesn't move the algorithm into a worse complexity class. Even if that means the actual running time does depend on the number of types, the time complexity does not.

    Update:

    Based on your comments, let me give you an example of a function that is, complexity-wise, independent of n (a.k.a. "constant" or O(1)) while its actual runtime of course does depend on n:

    #include <limits>
    #include <vector>

    int get_max_of_first_50(std::vector<int> const& v)
    {
        int max = std::numeric_limits<int>::min();
        // Cap the loop at v.size() to avoid out-of-bounds access
        // when the vector has fewer than 50 elements.
        std::size_t const n = v.size() < 50 ? v.size() : 50;
        for (std::size_t i = 0; i < n; ++i)
            if (v[i] > max)
                max = v[i];

        return max;
    }
    

    This function will always do 50 iterations or fewer, no matter how huge v.size() might be. Hence, complexity-wise, it is classified as a constant algorithm, independent of n, or O(1); all three statements are equivalent.

    This may look like cheating, and to a certain degree I can understand this reasoning, but it is important to understand that this is the very reason why complexity was introduced this way in the first place:

    • Allowing certain simplifications in the analysis of algorithms, which can otherwise easily become very difficult.
    • Caring only about huge inputs. Usually, nobody cares if something is done in 5ms or 10ms. But 5 days vs. 10 days does matter.

    BTW, this way of "cheating" is very common. Another famous example is the implementation of std::sort. Most library implementations that I've heard of use Quicksort (with some tricks, so it is O(n log n) even in the worst case), but for small sequences they usually switch to Insertion Sort, which is provably faster on very small ranges even though it is an O(n^2) algorithm by itself. As already explained, from the point of view of complexity this is fine as long as its usage is bounded by a constant; and from the point of view of actual runtime it is of course also fine, as it is faster.
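
    That hybrid idea can be sketched like this (an illustration of the technique only, not any library's actual std::sort; the cutoff of 16 is an arbitrary choice for this sketch):

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Quicksort that falls back to insertion sort below a fixed cutoff.
// Since the O(n^2) insertion sort only ever runs on ranges of bounded
// size, it contributes a constant factor per call and the algorithm as
// a whole keeps Quicksort's complexity.
constexpr long kCutoff = 16; // arbitrary threshold for this sketch

void insertion_sort(std::vector<int>& v, long lo, long hi) { // sorts [lo, hi)
    for (long i = lo + 1; i < hi; ++i)
        for (long j = i; j > lo && v[j - 1] > v[j]; --j)
            std::swap(v[j - 1], v[j]);
}

void hybrid_sort(std::vector<int>& v, long lo, long hi) { // sorts [lo, hi)
    if (hi - lo <= kCutoff) {    // small range: bounded work per call
        insertion_sort(v, lo, hi);
        return;
    }
    int const pivot = v[lo];     // Hoare partition around the first element
    long i = lo - 1, j = hi;
    while (true) {
        do { ++i; } while (v[i] < pivot);
        do { --j; } while (v[j] > pivot);
        if (i >= j) break;
        std::swap(v[i], v[j]);
    }
    hybrid_sort(v, lo, j + 1);
    hybrid_sort(v, j + 1, hi);
}

void hybrid_sort(std::vector<int>& v) {
    hybrid_sort(v, 0, static_cast<long>(v.size()));
}
```

    (Real implementations add further tricks, e.g. bounding the recursion depth, to guarantee the worst case; those are omitted here.)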