Search code examples
c++logical-operatorsstd-bitset

binary logical operations with long bit sequences


C++ Problem:

The minimum number of 1's should be found in the result of the XOR operation between a randomly generated N Bit sequence (N up to 50) std::bitset<N> visit{0b01...110}; and a sequences of 1's, starting from std::bitset<N> ones{0b0},std::bitset<N> ones{0b1} and grows up to the MsB of visit so ones{0b111...111};

std::vector<int> countPenalty(std::bitset<N> set)
{
    std::bitset<N> closing{0b0};
    std::bitset<N> eins{0b1};
    std::vector<int> countPen; 
    countPen.reserve(N);

    while(std::bit_width(closing.to_ulong()) <= std::bit_width(set.to_ulong()))
    {  
        std::bitset<N> tempSet = set ^ closing; 
        countPen.push_back(std::popcount(tempSet.to_ulong()));
        closing <<= 1;
        closing |= eins;    
    }
     return countPen;
}


int main()
{
   std::string s{"-11--111"};
   std::bitset<N> visitorSet(s, 0, s.size(), '0', '1');
   std::vector<int> penalties = countPenalty(visitorSet);
   auto minRes = std::min_element(penalties.begin(), penalties.end());
   std::cout << *minRes << std::endl; //2
   return 0;
}

Tasks similar to that one can be found very often in job interview preparations or c++ coding homeworks or leetcode (some similar task description I saw one is mentioned below). But I read on stackoverflow that to use big std. bitsets for binary logic operations is not a efficient solution. Actually why not? But is there a more elegant way which doesn't need to many code lines? I mean nobody wants to code a super complicated code on a job interview. And isn't bitset a good solution especially since c++20?

"Real life context to the code":

There is a ticket counter with a person working there. It just makes sense to have the ticket counter open when there are also clients who want to buy a ticket. So there are two cases of penalties:

  • the ticket counter is open but there is no client
  • the ticket counter is closed but there would be clients.

They make every day a list if there where clients (yes 1,no 0) in the ith hour. This will look like the following ::string s{"01100111"}; for a day with 8 hours. Now there is the question if one should close the shop at specific hour to save money. The aim is to have as few penalties as possible.

Here a list of the visitors and the opening hours (0= ticket counter is open, 1= ticket counter is closed)

enter image description here

The XOR between the Visitor bitset and the opening/closing bitset gives me the set of penalties. There I can find the minimum of penalties which will show when it is best to close the ticket counter, to save money.

enter image description here


Solution

  • std::bitset is inefficient

    But I read on stackoverflow that to use big std. bitsets for binary logic operations is not a efficient solution.

    This is true. std::bitset has requirements which make it inefficient, such as std::bitset::set throwing exceptions when an index is out of range. This leads to significant codegen differences:

    #include <bitset>
    #include <bit>
    #include <cstdint>
    
    auto set(std::bitset<64> x, std::size_t i) {
        x.set(i);
        return x;
    }
    
    auto set(std::uint64_t x, std::uint64_t i) {
        return x | (std::uint64_t{1} << i);
    }
    

    The first function compiles to:

    set(std::bitset<64ul>, unsigned long):
            mov     rdx, rsi
            cmp     rsi, 64
            jae     .error_handling_stuff
            bts     rdi, rdx
            mov     rax, rdi
            ret
    .error_handling_stuff:
            [...]
    

    The second function compiles to:

    set(unsigned long, unsigned long):
            mov     rax, rdi
            bts     rax, rsi
            ret
    

    See live example at Compiler Explorer

    See more discussion here: What is the performance of std::bitset?

    std::bitset can be a good choice despite being inefficient

    But is there a more elegant way which doesn't need to many code lines? I mean nobody wants to code a super complicated code on a job interview. And isn't bitset a good solution especially since c++20?

    I would still use std::bitset because outside of specific performance-critical use cases, std::bitset is good enough. Worry about writing correct code first, then worry about making it efficient, if it's not efficient enough yet.

    See also When is optimisation premature?

    If you embrace std::bitset, you can improve code quality, which is going to impress your employer more than knowing about niche performance issues related to some standard library type:

    while(std::bit_width(closing.to_ulong()) <= std::bit_width(set.to_ulong()))
    {  
        countPen.push_back((set ^ closing).count()); // use count() instead of std::popcount
        closing <<= 1;
        closing.set(0); // use set(0) instead of | here
    }
    return countPen;
    

    If you're still worried about performance, you can create your own std::bitset-style container which doesn't perform run-time checks.