
How do I parse an expression with nested parentheses with Boost.Spirit?


I need to parse 1-line expressions containing key/value pairs and key/subexpression pairs, like:

123=a 456=b 789=(a b c) 111=((1=a 2=b 3=c) (1=x 2=y 3=z) (123=(x y z))) 666=evil

To make the parser simpler, I'm willing to do the parsing in several steps: first separating the first-level tags (here 123, 456, 789, 111 and 666), and then parsing their content in another step. Here 789's value would be "a b c", and 111's value would be (1=a 2=b 3=c) (1=x 2=y 3=z) (123=(x y z)).

But grammars beat me at this point, so I can't figure out a way to get the expressions between matching parentheses. All I get for 111 is (1=a 2=b 3=c, which ends at the first closing parenthesis.

I found this handy example and tried to use it, with no success:

#include <map>
#include <string>
#include <boost/spirit/include/classic.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/std_pair.hpp>

namespace qi = boost::spirit::qi;

int main()
{
    auto                                                                   value = +qi::char_("a-zA-Z_0-9");
    auto                                                                   key   =  qi::char_("a-zA-Z_") >> *qi::char_("a-zA-Z_0-9");
    qi::rule<std::string::iterator, std::pair<std::string, std::string>()> pair  =  key >> -('=' >> value);
    qi::rule<std::string::iterator, std::map<std::string, std::string>()>  query =  pair >> *((qi::lit(';') | '&') >> pair);

    std::string input("key1=value1;key2;key3=value3");  // input to parse
    std::string::iterator begin = input.begin();
    std::string::iterator end = input.end();

    std::map<std::string, std::string> m;        // map to receive results
    bool result = qi::parse(begin, end, query, m);   // returns true if successful
}

How can I do that?

Edit: I found the example at http://boost-spirit.com/home/articles/qi-example/parsing-a-list-of-key-value-pairs-using-spirit-qi/


Solution

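  • You could write it as follows. Since pair now refers to query and query refers to pair, query has to be declared before pair is defined (for example, with both rules as members of a grammar):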

    qi::rule<std::string::iterator, std::pair<std::string, std::string>()> pair = 
            key >> -(
               '=' >> ( '(' >> qi::raw[query] >> ')' | value )
            )
        ;
    

    which will store each embedded query as the value (a string) associated with its key. This drops the enclosing parentheses from the stored values, though. If you want the parentheses to be kept in the returned attributes, use this instead:

    qi::rule<std::string::iterator, std::pair<std::string, std::string>()> pair = 
            key >> -(
               '=' >> ( qi::raw['(' >> query >> ')'] | value )
            )
        ;
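
If you would rather follow the multi-step approach described in the question, the first-level split can also be done without Spirit by counting parenthesis depth. The following is only a minimal sketch of that idea, not a Spirit grammar: the function name split_top_level is hypothetical, and it assumes keys and plain values contain no spaces and that pairs are separated by single spaces, as in the sample input.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits "k=v k=(...) ..." into first-level (key, value) pairs.
// A parenthesized value runs up to its *matching* ')'. The outer pair of
// parentheses is dropped; nested ones are kept so the value can be parsed
// again in a later step.
std::vector<std::pair<std::string, std::string>>
split_top_level(const std::string& input)
{
    std::vector<std::pair<std::string, std::string>> result;
    const std::size_t n = input.size();
    std::size_t i = 0;

    while (i < n) {
        if (input[i] == ' ') { ++i; continue; }   // skip separators

        // Key: everything up to the next '='.
        const std::size_t eq = input.find('=', i);
        if (eq == std::string::npos)
            throw std::runtime_error("expected '=' after key");
        std::string key = input.substr(i, eq - i);
        i = eq + 1;

        std::string value;
        if (i < n && input[i] == '(') {
            // Track nesting depth until the opening '(' is closed again.
            const std::size_t start = i + 1;
            int depth = 0;
            for (; i < n; ++i) {
                if (input[i] == '(') ++depth;
                else if (input[i] == ')' && --depth == 0) break;
            }
            if (i == n)
                throw std::runtime_error("unbalanced parentheses");
            value = input.substr(start, i - start);
            ++i;                                  // skip the matching ')'
        } else {
            // Plain value: runs to the next space or to the end of input.
            const std::size_t stop = input.find(' ', i);
            value = input.substr(i, stop == std::string::npos ? stop : stop - i);
            i = (stop == std::string::npos) ? n : stop;
        }
        result.emplace_back(std::move(key), std::move(value));
    }
    return result;
}
```

Called on the sample line from the question, this yields five pairs; the value for 111 is the whole (1=a 2=b 3=c) (1=x 2=y 3=z) (123=(x y z)) string, which can then be fed to a second parsing step.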