c++, c++11, istream-iterator, std-bitset

Read bitset from file using istream_iterator


I'm trying to read a text file into a vector of std::bitset objects. The file consists of 1's and spaces (a space indicating 0, or false), and I'm trying to do this by overloading the input operator>> and using istream_iterators. I start by changing the spaces in the string to 0's and then construct a bitset from that string, which I push to the vector. However, when I print the result, the output looks very strange, and it seems that I'm reading in more elements than there are lines in the file (the vector of bitsets is longer than the number of lines). The code looks as follows:

  
  template<long unsigned int N>
  std::istream& operator>>(std::istream& is, std::bitset<N>& b){
    std::string line;
    std::getline(is,line);
    for(unsigned long int i =0;i<N;++i){line[i]==' '? line[i]='0':'1';};
    std::cout << line << std::endl;
    b = std::bitset<N>(line);
    return is;
  }
 
 int main(){
  const unsigned long int N = 40;
  std::ifstream file("activities.txt");
  std::istream_iterator<std::bitset<N>> begin(file),end;
  std::vector<std::bitset<N>> bits;
  
  for(auto it = begin;it!=end;++it){
    bits.push_back(*it);
  }
  
  for(auto bit: bits){ std::cout << bit << std::endl;}
 }

I succeeded in doing the same thing without using the istream_iterator and the overloading of operator>> but I'm just trying to figure out why this way is not working.


Solution

  • Interesting question.

    The reason why this happens is that std::bitset already has an overloaded extraction operator >>; see the reference documentation for std::bitset's operator>>. It behaves like a formatted input function: it skips leading white space and then reads characters until it meets one that is neither '0' nor '1' (a space, for example), until it reaches "end of file", or until it has read as many characters as the std::bitset has bits. Your own operator>> lives in the global namespace, while std::istream_iterator performs its read from inside namespace std, where name lookup only finds the standard overload. So your function is never called.

    As it happens, your source file contains spaces, which act as separators for this formatted input function. That is why no error is reported: every run of consecutive '1's is simply read as a separate bitset.

    As a result, you should see a lot of bitsets in your std::vector, each consisting of many leading '0's followed by a few '1's, with the run of '1's varying in length.

    That is the result of reading many islands of '1's and padding each string on the left with '0's, so that the final bitset has 40 bits.
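
    To see this effect in isolation, here is a minimal sketch that feeds a single line through the standard operator>>. The helper name read_with_builtin and the 8-bit width are only assumptions for this example:

```cpp
#include <bitset>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: extracts bitsets with the standard operator>>
// until the stream fails, and returns their textual form.
template <std::size_t N>
std::vector<std::string> read_with_builtin(const std::string& text) {
    std::istringstream in(text);
    std::vector<std::string> out;
    std::bitset<N> b;
    // Each extraction skips white space, then reads one run of '0'/'1' characters.
    while (in >> b) {
        out.push_back(b.to_string());
    }
    return out;
}
```

    Calling read_with_builtin<8>("1 11  111") yields three entries, "00000001", "00000011" and "00000111", one per island of '1's, even though the input is a single line.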

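    If you want C++ to call your function, you would need to open the std namespace and define it there. While this can be done, it is strongly discouraged, except for template specializations; strictly speaking, adding your own declarations to namespace std is undefined behavior. See: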

    NOT recommended:

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <iterator>
    #include <bitset>
    #include <vector>
    
    constexpr unsigned long int N = 40;
    namespace std {
        template<long unsigned int N>
        std::istream& operator>>(std::istream& is, std::bitset<N>& b) {
            std::string line;
            std::getline(is, line);
            for (unsigned long int i = 0; i < N; ++i) { line[i] == ' ' ? line[i] = '0' : '1'; };
            std::cout << line << std::endl;
            b = std::bitset<N>(line);
            return is;
        }
    }
    
    int main() {
    
        std::ifstream file("activities.txt");
        std::istream_iterator<std::bitset<N>> begin(file), end;
        std::vector<std::bitset<N>> bits;
    
        for (auto it = begin; it != end; ++it) {
            bits.push_back(*it);
        }
    
        for (auto& bit : bits) { std::cout << bit << std::endl; }
    }
    

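    This can basically work, but it should not be used. Additionally, in my opinion there is a semantic bug in the for loop of the >> operator: it compares against "N", where "line.length()" would be better. If a line contains more than 40 characters, the spaces beyond position 40 are not converted, stay ' ' (space), and may lead to an exception for invalid input. And if a line is shorter than 40 characters, the loop indexes past the end of the string.

    Correcting that (in any case, longer strings in a line will be truncated) and using a template specialization in the "std" namespace, you could write the following. Note that the standard only permits such specializations when they depend on a user-defined type, so treat this as something that works in practice rather than as guaranteed behavior: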

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <iterator>
    #include <bitset>
    #include <vector>
    
    constexpr unsigned long int N = 40;
    namespace std {
    
        template<>
        std::istream& operator>>(std::istream& is, std::bitset<N>& b) {
            std::string line;
            std::getline(is, line);
            // Replace every space with '0'; all other characters are left untouched
            for (unsigned long int i = 0; i < line.length(); ++i) { if (line[i] == ' ') line[i] = '0'; }
            //std::cout << line << std::endl;
            b = std::bitset<N>(line);
            return is;
        }
    }
    
    int main() {
    
        std::ifstream file("activities.txt");
        std::istream_iterator<std::bitset<N>> begin(file), end;
        std::vector<std::bitset<N>> bits;
    
        for (auto it = begin; it != end; ++it) {
            bits.push_back(*it);
        }
    
        for (auto& bit : bits) { std::cout << bit << std::endl; }
    }
    

    This would work.


    You can make your life even simpler by taking advantage of std::bitset's constructor number 3, which lets you specify which characters stand for zero and one, so the spaces can be interpreted as '0's directly.
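
    As a rough illustration of that constructor (the helper name parse_row is made up for this sketch), the fourth and fifth arguments name the characters to be treated as zero and one:

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical helper: builds a bitset from a line in which ' ' means 0
// and '1' means 1. At most the first N characters of the line are used.
template <std::size_t N>
std::bitset<N> parse_row(const std::string& line) {
    // Arguments: string, start position, max. characters, zero char, one char
    return std::bitset<N>(line, 0, N, ' ', '1');
}
```

    With this, parse_row<8>("1  1 11") gives 01001011: the seven characters are mapped to bits and the result is padded with one zero on the left. Only the first N characters are used, so longer lines are truncated.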

    Also, please use std::vector's range constructor (number 5).

    With that, the code will be even simpler:

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <iterator>
    #include <bitset>
    #include <vector>
    
    constexpr unsigned long int N = 40;
    namespace std {
    
        template<>
        std::istream& operator>>(std::istream& is, std::bitset<N>& b) {
            std::string line;
            std::getline(is, line);
            b = std::bitset<N>(line, 0, N, ' ', '1');
            return is;
        }
    }
    
    int main() {
    
        std::ifstream file("activities.txt");
    
        std::vector<std::bitset<N>> bits(std::istream_iterator<std::bitset<N>>(file), {});
    
        for (auto& bit : bits) { std::cout << bit << std::endl; }
    }
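
    If you would rather not touch namespace std at all, one option is to wrap the bitset in a small type of your own and overload operator>> for that type. The sketch below assumes a hypothetical wrapper named BitLine and a hypothetical helper read_rows; neither comes from the original code. Because BitLine lives outside std, argument-dependent lookup lets std::istream_iterator find the overload:

```cpp
#include <bitset>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

constexpr std::size_t N = 40;

// Hypothetical wrapper type, defined outside namespace std
struct BitLine {
    std::bitset<N> bits;
};

// Reads one whole line; ' ' is treated as 0 and '1' as 1.
// If getline fails (e.g. at end of file) the stream stays in a failed
// state, which is what ends the istream_iterator loop.
std::istream& operator>>(std::istream& is, BitLine& row) {
    std::string line;
    if (std::getline(is, line)) {
        row.bits = std::bitset<N>(line, 0, N, ' ', '1');
    }
    return is;
}

// Hypothetical helper: collects one bitset per input line
std::vector<std::bitset<N>> read_rows(std::istream& is) {
    std::vector<std::bitset<N>> rows;
    for (std::istream_iterator<BitLine> it(is), end; it != end; ++it) {
        rows.push_back(it->bits);
    }
    return rows;
}
```

    You would call it as read_rows(file) with the std::ifstream from above. The price is one extra type, but no declaration is added to namespace std.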