I want to scan a file for all enums (this is just an MCVE, so nothing complicated) and store the names of the enums in a std::vector.
I build my parsers like this:
    auto const any = x3::rule<class any_id, const x3::unused_type>{"any"}
        = ~x3::space;
    auto const identifier = x3::rule<class identifier_id, std::string>{"identifier"}
        = x3::lexeme[x3::char_("A-Za-z_") >> *x3::char_("A-Za-z_0-9")];
    auto const enum_finder = x3::rule<class enum_finder_id, std::vector<std::string>>{"enum_finder"}
        = *(("enum" >> identifier) | any);
When I parse a string with this enum_finder into a std::vector, the std::vector also contains a lot of empty strings.
Why is this parser also parsing empty strings into the vector?
I've assumed you want to parse "enum <identifier>" out of free-form text, ignoring whitespace.
What you really want is for ("enum" >> identifier | any) to synthesize an optional<string>. Sadly, what you get is variant<string, unused_type> or somesuch.
The same happens when you wrap any with x3::omit[any]: it's still the same unused_type.
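To make the difference concrete, here is a plain C++17 model of the two behaviours. It is only a sketch and not Spirit's actual code: it assumes that the Kleene star appends one default-constructed std::string per successful iteration regardless of which branch matched (which is what the empty strings suggest), it works on whole whitespace-delimited words, and it accepts any word after "enum" as the name. The names split_words, match_one, collect_naive and collect_optional are made up for this illustration.

```cpp
#include <cstddef>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Whitespace split, standing in for the x3::space skipper (a simplification).
std::vector<std::string> split_words(const std::string& input) {
    std::istringstream in(input);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

// One iteration of the alternative: either "enum" followed by a name,
// or a single bogus word. Returns the name only if the first branch matched.
std::optional<std::string> match_one(const std::vector<std::string>& w, std::size_t& i) {
    if (w[i] == "enum" && i + 1 < w.size()) {
        std::string name = w[i + 1];
        i += 2;
        return name;
    }
    ++i;  // the `any` branch: consumes input but yields no attribute
    return std::nullopt;
}

// Models the observed behaviour: an element is appended after every
// successful iteration, even if the `any` branch left it empty.
std::vector<std::string> collect_naive(const std::string& input) {
    auto w = split_words(input);
    std::vector<std::string> out;
    for (std::size_t i = 0; i < w.size();) {
        std::string element;  // starts out empty
        if (auto name = match_one(w, i)) element = *name;
        out.push_back(element);  // pushed unconditionally
    }
    return out;
}

// Models the wished-for optional<string> semantics: append real matches only.
std::vector<std::string> collect_optional(const std::string& input) {
    auto w = split_words(input);
    std::vector<std::string> out;
    for (std::size_t i = 0; i < w.size();) {
        if (auto name = match_one(w, i)) out.push_back(*name);
    }
    return out;
}
```

With the input "enum one bogus enum two", collect_naive yields "one", an empty string, and "two", while collect_optional yields just "one" and "two".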
Plan B: Since you're really just parsing repeated enum-ids separated by "anything", why not use the list operator:
    ("enum" >> identifier) % any
This works a little. Now some tweaking: let's avoid eating "any" input character by character. In fact, we can likely just consume whole whitespace-delimited words (note that +~space is equivalent to +graph for ordinary printable text):
    auto const any = x3::rule<class any_id>{"any"}
        = x3::lexeme[+x3::graph];
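How close are +~space and +graph? A quick check with the standard library can tell, under the assumption that X3's space and graph follow std::isspace and std::isgraph in the default "C" locale. The helper names classes_agree and count_disagreements are made up for this sketch.

```cpp
#include <cctype>

// True when "not whitespace" and "graphical" give the same answer for c.
bool classes_agree(unsigned char c) {
    return (std::isspace(c) == 0) == (std::isgraph(c) != 0);
}

// Counts the 7-bit ASCII characters on which the two classifications differ.
int count_disagreements() {
    int n = 0;
    for (int c = 0; c < 128; ++c)
        if (!classes_agree(static_cast<unsigned char>(c))) ++n;
    return n;
}
```

The two only disagree on control characters such as '\x01', which are neither whitespace nor graphical, so for normal source text either spelling of the parser accepts the same words.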
Next, to accept several bogus words in a row, there's a trick: make the list's subject parser optional:
    -("enum" >> identifier) % any;
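To see what the optional-subject list does step by step, here is a hand-written model in plain C++. It relies on a % b behaving like a >> *(b >> a), so the grammar above reads as "optional enum, then repeatedly: one separator word followed by an optional enum". This is a simplified sketch, not what X3 generates: it treats every whitespace-delimited word as one unit, and split_words, is_identifier and find_enums are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Whitespace split, standing in for the x3::space skipper (a simplification).
std::vector<std::string> split_words(const std::string& input) {
    std::istringstream in(input);
    std::vector<std::string> words;
    for (std::string w; in >> w;) words.push_back(w);
    return words;
}

// Whole-word version of char_("A-Za-z_") >> *char_("A-Za-z_0-9").
bool is_identifier(const std::string& w) {
    if (w.empty()) return false;
    auto head = static_cast<unsigned char>(w[0]);
    if (!(std::isalpha(head) || head == '_')) return false;
    for (unsigned char c : w)
        if (!(std::isalnum(c) || c == '_')) return false;
    return true;
}

// Models -("enum" >> identifier) % any on a word level.
std::vector<std::string> find_enums(const std::string& input) {
    const auto w = split_words(input);
    std::vector<std::string> out;
    std::size_t i = 0;

    // Optional subject: consumes "enum <name>" if present, otherwise nothing.
    auto subject = [&] {
        if (i + 1 < w.size() && w[i] == "enum" && is_identifier(w[i + 1])) {
            out.push_back(w[i + 1]);
            i += 2;
        }
    };

    subject();
    while (i < w.size()) {
        ++i;        // mandatory separator `any`: eats exactly one word
        subject();  // then another optional subject
    }
    return out;
}
```

Note what the mandatory separator implies: in "enum one enum two" the second "enum" is eaten as a separator word, so only "one" is reported, whereas "enum one bogus enum two" reports both names.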
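This parses as intended, with one caveat: `any` is a mandatory separator, so an enum that directly follows another one (as in "enum one enum two") is eaten as separator words and not reported. See a full demo: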
    #include <boost/spirit/home/x3.hpp>

    namespace x3 = boost::spirit::x3;

    namespace parser {
        using namespace x3;

        // One whitespace-delimited word of arbitrary non-space characters
        auto any         = lexeme [+~space];
        auto identifier  = lexeme [char_("A-Za-z_") >> *char_("A-Za-z_0-9")];
        // Optional enum declarations, separated by single "any" words
        auto enum_finder = -("enum" >> identifier) % any;
    }

    #include <iostream>
    #include <string>
    #include <vector>

    int main() {
        for (std::string input : {
                "",
                " ",
                "bogus",
                "enum one",
                "enum one enum two",
                "enum one bogus bogus more bogus enum two !@#!@#Yay",
            })
        {
            auto f = input.begin(), l = input.end();
            std::cout << "------------ parsing '" << input << "'\n";

            std::vector<std::string> data;
            if (phrase_parse(f, l, parser::enum_finder, x3::space, data))
            {
                std::cout << "parsed " << data.size() << " elements:\n";
                for (auto& el : data)
                    std::cout << "\t" << el << "\n";
            } else {
                std::cout << "Parse failure\n";
            }

            // Report any input the parser did not consume
            if (f!=l)
                std::cout << "Remaining unparsed: '" << std::string(f,l) << "'\n";
        }
    }
Prints:
    ------------ parsing ''
    parsed 0 elements:
    ------------ parsing ' '
    parsed 0 elements:
    ------------ parsing 'bogus'
    parsed 0 elements:
    ------------ parsing 'enum one'
    parsed 1 elements:
            one
    ------------ parsing 'enum one enum two'
    parsed 1 elements:
            one
    ------------ parsing 'enum one bogus bogus more bogus enum two !@#!@#Yay'
    parsed 2 elements:
            one
            two