I'm using a parser which skips white space. At one point I don't want to skip, so I want to use qi::lexeme. However, this either does not compile or messes up my results. I especially can't grasp the last point. How are the attributes of a lexeme handled?
Here is an example of what I'm trying to do:
#include <iostream>
#include <iomanip>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/vector.hpp>

namespace qi = boost::spirit::qi;
namespace fu = boost::fusion;

struct printer_type
{
    void operator() (int i) const
    {
        std::cout << i << ' ';
    }
    void operator() (std::string s) const
    {
        std::cout << '"' << s << '"' << ' ';
    }
} printer;

int main() {
    for (std::string str : { "1foo 13", "42 bar 13", "13cheese 8", "101pencil13" }) {
        auto iter = str.begin(), end = str.end();
        qi::rule<std::string::iterator, qi::blank_type, fu::vector<int, std::string, int>()> parser =
            qi::int_ >> +qi::alpha >> qi::int_;
        fu::vector<int, std::string, int> result;
        bool r = qi::phrase_parse(iter, end, parser, qi::blank, result);
        std::cout << " --- " << std::quoted(str) << " --- ";
        if (r) {
            std::cout << "parse succeeded: ";
            fu::for_each(result, printer);
            std::cout << '\n';
        } else {
            std::cout << "parse failed.\n";
        }
        if (iter != end) {
            std::cout << " Remaining unparsed: " << std::string(iter, str.end()) << '\n';
        }
    }
}
Notice this line:
qi::rule<std::string::iterator, qi::blank_type, fu::vector<int, std::string, int>()> parser =
qi::int_ >> +qi::alpha >> qi::int_;
So we want an int, then a string, and then another int. However, I don't want to skip white space between the first int and the string: there must be no space at that point. If I use lexeme, the synthesized attributes get messed up.
A run without lexeme gives the following results:
--- "1foo 13" --- parse succeeded: 1 "foo" 13
--- "42 bar 13" --- parse succeeded: 42 "bar" 13
--- "13cheese 8" --- parse succeeded: 13 "cheese" 8
--- "101pencil13" --- parse succeeded: 101 "pencil" 13
So everything parses fine, which is good. However, the second example (42 bar 13) should not parse successfully, so here is the result with lexeme around the first int and the string (qi::lexeme[qi::int_ >> +qi::alpha] >> qi::int_):
" 0 "1foo 13" --- parse succeeded: 1 "
--- "42 bar 13" --- parse failed.
Remaining unparsed: 42 bar 13
--- "13cheese 8" --- parse succeeded: 13 " 0
" 0 "101pencil13" --- parse succeeded: 101 "
What!? I don't have the slightest clue what is going on; I'd be happy for any enlightenment :)
Side question: I would like to leave out lexeme entirely and define a subrule which does not skip. How can I specify the attributes in this case?
The subrule then has the attribute fusion::vector<int, std::string>(), but I still want the main rule to have fusion::vector<int, std::string, int>() as its attribute, not fusion::vector<fusion::vector<int, std::string>, int>() (which does not compile anyway).
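Use the no_skip directive: qi::int_ >> qi::no_skip[+qi::alpha] >> qi::int_
Unlike lexeme, no_skip does not pre-skip before its subject, and wrapping only +qi::alpha keeps the three-element attribute sequence flat. The output becomes: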
--- "1foo 13" --- parse succeeded: 1 "foo" 13
--- "42 bar 13" --- parse failed.
Remaining unparsed: 42 bar 13
--- "13cheese 8" --- parse succeeded: 13 "cheese" 8
--- "101pencil13" --- parse succeeded: 101 "pencil" 13
https://wandbox.org/permlink/PdS14l0b3qjJwz5S
Sooo.... what!? I have not the slightest clue what is going on, I'm happy for any enlightenment :)
As @llonesmiz mentioned, the qi::lexeme[qi::int_ >> +qi::alpha] >> qi::int_ parser binds to tuple<tuple<int,std::string>,int>, and you have triggered the trac 8013 bug/misfeature twice here (the first time for the whole sequence parser, and the second time for the sequence inside lexeme).