c++ c++11 boost

boost::to_lower_copy() too expensive as it extracts a facet from the passed locale


I've got std::use_facet<std::ctype<char>> showing up in a profiler. It is called indirectly from boost::to_lower_copy(). I remember fixing this for a comparison operator by storing the reference returned from std::use_facet<> only once and repeatedly calling facet.tolower(char) with it, avoiding repeated calls to std::use_facet<>. Is there a way to avoid this short of writing one's own to_lower(const std::string&) function?


Solution

  • boost::to_lower_copy() converts each character by calling the two-argument std::tolower(c, loc), which internally performs std::use_facet<std::ctype<char>>(loc). The facet is therefore looked up once per character. If your compiler doesn't hoist that lookup out of the loop, even with all optimizations enabled, I'm afraid there is no way around it from the outside: as far as I know, Boost does not provide any way to customize this function, so your only solution is to implement the conversion yourself.
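
    To make the cost concrete: the standard defines the two-argument std::tolower(c, loc) as equivalent to std::use_facet<std::ctype<char>>(loc).tolower(c), so a loop that calls it per character repeats the facet lookup on every iteration. The sketch below only illustrates that pattern; per_char_lower is a hypothetical name and this is not Boost's actual source.

    ```cpp
    #include <locale>
    #include <string>

    // Hypothetical illustration of the slow pattern (not Boost's source):
    // each std::tolower(c, loc) call looks up the ctype facet again.
    std::string per_char_lower(const std::string& input,
        const std::locale& loc = std::locale())
    {
        std::string out;
        out.reserve(input.size());
        for (char c : input)
            out.push_back(std::tolower(c, loc));  // facet lookup per character
        return out;
    }
    ```

    Whether an optimizer can remove the repeated lookup depends on the standard library and compiler, so profiling, as in the question, is the reliable way to tell.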

    Here is one possible solution for std::string (unlike Boost's implementation, which is generalized for arbitrary sequence types):

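    #include <algorithm>  // std::transform
    #include <iterator>   // std::back_inserter
    #include <locale>     // std::locale, std::use_facet, std::ctype
    #include <string>

    std::string to_lower_copy(const std::string& input,
        const std::locale& loc = std::locale())
    {
        // Look up the facet once instead of once per character.
        auto const& facet = std::use_facet<std::ctype<char>>(loc);

        std::string out;
        out.reserve(input.size());

        // ctype<char>::tolower takes a char, so the lambda takes a char as well.
        std::transform(input.begin(), input.end(), std::back_inserter(out),
            [&facet](char c) { return facet.tolower(c); });

        return out;
    }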
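
    A possible variation, sketched here under the assumption that taking the input by value is acceptable: std::ctype<char> also has a range overload, tolower(char* low, const char* high), which converts a whole buffer with a single facet call. The name to_lower_copy_range is hypothetical, and whether this is any faster than the std::transform version should be measured rather than assumed.

    ```cpp
    #include <locale>
    #include <string>

    // Sketch: lowercase a copy using the range overload of ctype<char>::tolower.
    // to_lower_copy_range is a hypothetical name used for illustration only.
    std::string to_lower_copy_range(std::string text,
        const std::locale& loc = std::locale())
    {
        if (!text.empty()) {
            auto const& facet = std::use_facet<std::ctype<char>>(loc);
            // Converts the characters in [low, high) in place.
            facet.tolower(&text[0], &text[0] + text.size());
        }
        return text;
    }
    ```

    Passing the string by value lets callers move a temporary in, so the only copy made is the one the caller asks for.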
    

    Here is a small benchmark demonstrating a ~5-7x performance advantage over Boost (in the run below, about 6.7x in wall time and 7.8x in CPU time):

    -----------------------------------------------------
    Benchmark           Time             CPU   Iterations
    -----------------------------------------------------
    BM_boost         4672 ns         2429 ns       443106
    BM_our            697 ns          311 ns      2585506
    

    Note that this approach is locale-specific and, like Boost's version, does not work for arbitrary UTF-8 strings, because it converts one char at a time.
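
    The UTF-8 limitation can be seen with a small sketch. It assumes the classic "C" locale, in which ctype<char> maps only the ASCII letters A-Z; lower_classic is a hypothetical helper, not part of Boost or the standard library.

    ```cpp
    #include <locale>
    #include <string>

    // Sketch: per-char lowercasing with the classic "C" locale.
    // lower_classic is a hypothetical helper used for illustration only.
    std::string lower_classic(std::string text)
    {
        auto const& facet =
            std::use_facet<std::ctype<char>>(std::locale::classic());
        for (char& c : text)
            c = facet.tolower(c);  // each byte is converted independently
        return text;
    }
    ```

    With this helper, lower_classic("\xC3\x84" "B") (the UTF-8 encoding of "Ä" followed by "B") returns the two bytes of "Ä" unchanged followed by "b": the multi-byte character is never recognized as an uppercase letter. Handling such input correctly needs a Unicode-aware library instead.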