c++ parsing c++14 boost-spirit boost-spirit-x3

Parsing a list of strings followed by a list of strings with spirit x3


I am trying to parse a string into a struct using boost spirit x3:

struct identifier {
    std::vector<std::string> namespaces;
    std::vector<std::string> classes;
    std::string identifier;

};

Now I have a parser rule that should match strings like these:

foo::bar::baz.bla.blub
foo.bar
boo::bar
foo

My parser rule looks like this:

auto const nested_identifier_def =
        x3::lexeme[
                -(id_string % "::")
                >> -(id_string % ".")
                >> id_string
        ];

where id_string parses combinations of alphanumeric characters. I know this rule doesn't parse the way I want: when parsing foo.bar, for example, the -(id_string % ".") part of the rule consumes the whole string. How can I change the rule so that it parses correctly into the struct?


Solution

  • Assuming your id_string is something like this:

    auto const id_string = x3::rule<struct id_string_tag, std::string>{} =
        x3::lexeme[
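                // char_('_') stores the underscore; a bare '_' literal would match it but not store it
                (x3::alpha | x3::char_('_'))
            >> *(x3::alnum | x3::char_('_'))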
        ];
    

    then I think this is what you're after:

    auto const nested_identifier_def =
           *(id_string >> "::")
        >> *(id_string >> '.')
        >>  id_string;
    


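    The issue is that p % delimit is shorthand for p >> *(delimit >> p): it needs at least one p, and every delimiter it accepts must be followed by another p. The list therefore also swallows the last element and leaves nothing for the parsers that follow. What you want instead is *(p >> delimit): each repetition matches a p only together with its trailing delimiter, so the final p, which has no delimiter after it, is left for the next parser in the sequence.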