Tags: c++, parsing, boost-spirit, std-pair

Use Boost Spirit to parse int pairs into a vector


The string content looks like this:

20 10 5 3...

It is a list of int pairs. How can I use Spirit to parse it into a std::vector<std::pair<int, int>>?

    std::string line;
    std::vector<std::pair<int, int>> v;
    boost::spirit::qi::phrase_parse(
        line.cbegin(),
        line.cend(),
        (
                   ???
        ),
        boost::spirit::qi::space
    );

Solution

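  • You can use a simple parser expression such as *(int_ >> int_): the sequence int_ >> int_ produces one pair, and the Kleene star collects the pairs into the vector (see the tutorial and the Spirit documentation).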

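    #include <boost/spirit/include/qi.hpp>
    #include <boost/fusion/include/std_pair.hpp> // lets Spirit fill std::pair
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>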
    
    namespace qi = boost::spirit::qi;
    
    int main() {
        std::string line = "20 10 5 3";
        std::vector<std::pair<int, int>> v;
        qi::phrase_parse(line.cbegin(), line.cend(), *(qi::int_ >> qi::int_), qi::space, v);
    
        for (auto& p : v) {
            std::cout << "(" << p.first << ", " << p.second << ")\n";
        }
    }
    

    Prints

    (20, 10)
    (5, 3)
    

    Pro Tip 1: Validity

    If you want to make sure there's no unwanted/unexpected input, check for remaining data:

    • check the iterators after parsing

      auto f = line.cbegin(), l = line.cend();
      qi::phrase_parse(f, l, *(qi::int_ >> qi::int_), qi::space, v);
      
      if (f!=l)
          std::cout << "Unparsed input '" << std::string(f,l) << "'\n";
      
    • or simply require qi::eoi as part of the parser expression and check the return value:

      bool ok = qi::phrase_parse(line.cbegin(), line.cend(), *(qi::int_ >> qi::int_) >> qi::eoi, qi::space, v);
      

    Pro Tip 2: "Look ma, no hands"

    Since the grammar is the simplest thing that could parse into this data structure, you can let Spirit do all the guesswork:

    qi::phrase_parse(line.begin(), line.end(), qi::auto_, qi::space, v);
    

    That is, a grammar consisting of nothing but a single qi::auto_. The output is still:

    (20, 10)
    (5, 3)