I am trying to create an optional parser rule. Depending on the value of the first attribute, I want to optionally emit data.
For example, given the input:
x,2,3
y,3,4
x,5,6
If the first character is a y then the line should be discarded. Otherwise it will be processed. In this example, the bool is true if the 3rd attribute is >= 4. The synthesized attribute should be std::pair<bool, unsigned int> where the unsigned int value is the second attribute.
The parser is:
namespace qi = boost::spirit::qi;
using Data = std::pair<bool, unsigned>;
BOOST_PHOENIX_ADAPT_FUNCTION(Data, make_pair, std::make_pair, 2);
class DataParser :
    public qi::grammar<
        std::string::iterator,
        boost::spirit::char_encoding::ascii,
        boost::spirit::ascii::space_type,
        std::vector<Data>()
    >
{
    qi::rule<iterator_type, encoding_type, bool()> type;
    qi::rule<iterator_type, encoding_type, bool()> side;
    // doesn't compile: qi::rule<iterator_type, encoding_type, boost::spirit::ascii::space_type, boost::optional<Data>()> line;
    qi::rule<iterator_type, encoding_type, boost::spirit::ascii::space_type, qi::locals<bool, unsigned, bool>, Data()> line;
    qi::rule<iterator_type, encoding_type, boost::spirit::ascii::space_type, sig_type> start;
public:
    DataParser()
        : base_type(start)
    {
        using namespace qi::labels;
        type = qi::char_[_val = _1 == 'x'];
        side = qi::int_[_val = _1 >= 4];
        line %= (qi::omit[type[_a = _1]] >> ',' >> qi::omit[qi::uint_[_b = _1]] >> ',' >> qi::omit[side[_c = _1]])[if_(_a)[_val = make_pair(_c, _b)]];
        // doesn't compile: line %= (qi::omit[type[_a = _1]] >> ',' >> qi::omit[qi::uint_[_b = _1]] >> ',' >> qi::omit[side[_c = _1]])[if_(_a)[_val = make_pair(_c, _b)].else_[_val = qi::unused]];
        // doesn't compile: line %= (type >> ',' >> qi::uint_ >> ',' >> side)[if_(_1)[_val = make_pair(_3, _2)]];
        // doesn't compile: line %= (type >> ',' >> qi::uint_ >> ',' >> side)[if_(_1)[_val = make_pair(_3, _2)].else_[_val = unused]];
        start = *line;
    }
};
I get: [[false, 2], [false, 0], [true, 5]] where I want to get: [[false, 2], [true, 5]] (the second entry should be discarded).
I tried using boost::optional<Data> for the data rule and also assigning unused to _val, but nothing worked.
The new rules are now:
using Data = std::pair<bool, unsigned>;
BOOST_PHOENIX_ADAPT_FUNCTION(Data, make_pair, std::make_pair, 2);
class DataParser :
    public qi::grammar<
        std::string::iterator,
        boost::spirit::char_encoding::ascii,
        boost::spirit::ascii::blank_type,
        std::vector<Data>()
    >
{
    using Items = boost::fusion::vector<bool, unsigned, bool>;
    qi::rule<iterator_type, encoding_type, bool()> type;
    qi::rule<iterator_type, encoding_type, bool()> side;
    qi::rule<iterator_type, encoding_type, boost::spirit::ascii::blank_type, Items()> line;
    qi::rule<iterator_type, encoding_type, boost::spirit::ascii::blank_type, sig_type> start;
public:
    DataParser()
        : base_type(start)
    {
        using namespace qi::labels;
        namespace px = boost::phoenix;
        type = qi::char_[_val = _1 == 'x'];
        side = qi::int_[_val = _1 >= 4];
        line = type >> ',' >> qi::uint_ >> ',' >> side;
        start = line[if_(_1)[px::push_back(_val, make_pair(_3, _2))]] % qi::eol;
    }
};
The key point is to use the semantic action to decide whether the synthesized attribute should be added, using all attributes of the previous rule, in this case line.
Okay. You use lots of power-tools. But remember, with great power comes....
In particular, qi::locals, phoenix, semantic actions: they're all complicating life so only use them as a last resort (or when they're a natural fit, which is rarely¹).
Think directly,
start = *line;
line = // ....
When you say
If the first character is a y then the line should be discarded. Otherwise it will be processed.
You can express this directly:
line = !qi::lit('y') >> // ...
Alternatively, spell out what starters to accept:
line = qi::omit[ qi::char_("xz") ] >> // ...
Done.
Here I'll cheat by re-ordering the pair to pair<unsigned, bool> so it matches the input order. Now everything works out of the box without "any" magic:
line = !qi::lit('y') >> qi::omit[qi::alnum] >> ',' >> qi::int_ >> ',' >> side;
ignore = +(qi::char_ - qi::eol);
start = qi::skip(qi::blank) [ (line | ignore) % qi::eol ];
However it WILL result in the spurious entries as you noticed: Live On Compiler Explorer
Parsed: {(2, false), (0, false), (5, true)}
Now you could hack around this by changing the eol to also eat subsequent lines that don't appear to contain valid data lines. However, that becomes unwieldy, and we still want to flip the pair's members.
So, here's where I think an action could be handy:
public:
    DataParser() : DataParser::base_type(start) {
        using namespace qi::labels;
        start = qi::skip(qi::blank) [
            (qi::char_ >> ',' >> qi::uint_ >> ',' >> qi::int_) [
                _pass = process(_val, _1, _2, _3) ]
            % qi::eol ];
    }
private:
    struct process_f {
        template <typename... T>
        bool operator()(Datas& into, char id, unsigned type, int side) const {
            switch(id) {
                case 'z': case 'x':
                    into.emplace_back(side >= 4, type);
                    break;
                case 'y': // ignore
                    break;
                case 'a':
                    return false; // fail the rule
            }
            return true;
        }
    };
    boost::phoenix::function<process_f> process;
You can see, there's a nice separation of concerns now. You parse (char, unsigned, int) and conditionally process it. That's what keeps this relatively simple compared to your attempts.
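To make that separation concrete without any Spirit machinery, here is a minimal sketch in plain C++. It is not the grammar above: `process_line` and `parse_all` are hypothetical names, and the `sscanf`-based line splitting is only a hand-rolled stand-in for the `char_ >> ',' >> uint_ >> ',' >> int_` sequence (it does not reproduce the blank skipper exactly). The decision logic mirrors `process_f`.

```cpp
#include <cstdio>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Data  = std::pair<bool, unsigned>;
using Datas = std::vector<Data>;

// Hypothetical helper: same decisions as process_f, minus Phoenix.
// Returns false to signal that the whole parse should fail.
bool process_line(Datas& into, char id, unsigned type, int side) {
    switch (id) {
    case 'x': case 'z':
        into.emplace_back(side >= 4, type); // keep the line
        break;
    case 'y': // recognised, but deliberately dropped
        break;
    case 'a':
        return false; // reject the input
    }
    return true;
}

// Hypothetical stand-in for the grammar: one "char,uint,int" record per line.
bool parse_all(std::string const& input, Datas& out) {
    std::istringstream lines(input);
    std::string line;
    while (std::getline(lines, line)) {
        char id = 0;
        unsigned type = 0;
        int side = 0;
        // Step 1: parse the three raw fields, with no decisions taken here.
        if (std::sscanf(line.c_str(), " %c , %u , %d", &id, &type, &side) != 3)
            return false;
        // Step 2: decide what to do with them.
        if (!process_line(out, id, type, side))
            return false;
    }
    return true;
}
```

Calling `parse_all("x,2,3\ny,3,4\nx,5,6", d)` leaves two entries in `d`, (false, 2) and (true, 5), because the `y` line is parsed but never stored.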
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <fmt/ranges.h>
#include <string>
#include <utility>
#include <vector>
namespace qi = boost::spirit::qi;
using Data = std::pair<bool, unsigned>;
using Datas = std::vector<Data>;
template <typename It>
class DataParser : public qi::grammar<It, Datas()> {
    using Skipper = qi::blank_type;
    qi::rule<It, Datas(), Skipper> line;
    qi::rule<It, Datas()> start;
public:
    DataParser() : DataParser::base_type(start) {
        using namespace qi::labels;
        start = qi::skip(qi::blank) [
            (qi::char_ >> ',' >> qi::uint_ >> ',' >> qi::int_) [
                _pass = process(_val, _1, _2, _3) ]
            % qi::eol ];
    }
private:
    struct process_f {
        template <typename... T>
        bool operator()(Datas& into, char id, unsigned type, int side) const {
            switch(id) {
                case 'z': case 'x':
                    into.emplace_back(side >= 4, type);
                    break;
                case 'y': // ignore
                    break;
                case 'a':
                    return false; // fail the rule
            }
            return true;
        }
    };
    boost::phoenix::function<process_f> process;
};
int main() {
    using It = std::string::const_iterator;
    DataParser<It> p;
    for (std::string const input : {
            "x,2,3\ny,3,4\nx,5,6",
        })
    {
        auto f = begin(input), l = end(input);
        Datas d;
        auto ok = qi::parse(f, l, p, d);
        if (ok) {
            fmt::print("Parsed: {}\n", d);
        } else {
            fmt::print("Parse failed\n");
        }
        if (f!=l) {
            fmt::print("Remaining unparsed: '{}'\n", std::string(f,l));
        }
    }
}
Prints
Parsed: {(false, 2), (true, 5)}