Tags: c++, boost, tokenize, iterator-range

Tokenize string and store result in boost::iterator_range<std::string::iterator>


I need to tokenize a text (using ' ', '\n', and '\t' as delimiters) with something like

std::string text = "foo   bar";
boost::iterator_range<std::string::iterator> r = some_func_i_dont_know(text);

Later I want to get output with:

for (auto i: result)
    std::cout << "distance: " << std::distance(text.begin(), i.begin())
        << "\nvalue: " << i << '\n';

For the example above, this should produce:

distance: 0
value: foo
distance: 6
value: bar

Thanks for any help.


Solution

  • I would not use the ancient Boost.Tokenizer here. Just use the split function from the Boost String Algorithms library:

    #include <boost/algorithm/string.hpp>
    #include <iostream>
    #include <iterator>
    #include <string>
    #include <vector>
    
    int main()
    {
        std::string text = "foo   bar";
        boost::iterator_range<std::string::iterator> r(text.begin(), text.end());
    
        // Each token is a pair of iterators into `text`; no characters are copied.
        std::vector<boost::iterator_range<std::string::const_iterator> > result;
        // token_compress_on treats a run of adjacent delimiters as one separator.
        boost::algorithm::split(result, r, boost::is_any_of(" \n\t"), boost::algorithm::token_compress_on);
    
        for (auto i : result)
            std::cout << "distance: " << std::distance(text.cbegin(), i.begin()) << ", "
                      << "length: " << i.size() << ", "
                      << "value: '" << i << "'\n";
    }
    

    Prints

    distance: 0, length: 3, value: 'foo'
    distance: 6, length: 3, value: 'bar'
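
If Boost is not an option, the same offset-plus-token result can be approximated with the standard library alone. The following is only a minimal sketch, assuming C++17 (for std::string_view); the names Token and tokenize are hypothetical and not part of Boost or of the answer above. Unlike split, it skips empty tokens entirely, so leading or trailing delimiters do not produce empty entries.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper type: a token's starting offset in the source text
// plus a non-owning view of its characters.
struct Token {
    std::size_t offset;
    std::string_view value;
};

// Hypothetical helper: splits `text` on any character in `delims`,
// skipping empty tokens. The views point into `text`, so the underlying
// string must outlive the returned vector.
std::vector<Token> tokenize(std::string_view text,
                            std::string_view delims = " \n\t")
{
    std::vector<Token> tokens;
    // Start of the first token: first character that is not a delimiter.
    std::size_t pos = text.find_first_not_of(delims);
    while (pos != std::string_view::npos) {
        // End of the token: next delimiter, or the end of the text.
        std::size_t end = text.find_first_of(delims, pos);
        if (end == std::string_view::npos)
            end = text.size();
        tokens.push_back({pos, text.substr(pos, end - pos)});
        // Skip the delimiter run to reach the next token, if any.
        pos = text.find_first_not_of(delims, end);
    }
    return tokens;
}
```

Calling tokenize(text) on a named std::string holding "foo   bar" yields two tokens, at offsets 0 and 6. Do not pass a temporary string, because the returned views would dangle.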