I have a very basic Boost Spirit Qi grammar to parse either an IP port or an IP port range, i.e. either "6322" or "6322-6325".
The grammar looks like:
template<class It>
void init_port_rule(u16_rule<It>& port)
{
    port = boost::spirit::qi::uint_parser<uint16_t, 10, 2, 5>();
    port.name("valid port range: (10, 65535)");
}

typedef boost::fusion::vector
    < std::uint16_t
    , boost::optional<std::uint16_t>
    > port_range_type
    ;

template<class It>
struct port_range_grammar
    : boost::spirit::qi::grammar
        < It
        , port_range_type()
        >
{
    typedef typename port_range_grammar::base_type::sig_type signature;

    port_range_grammar()
        : port_range_grammar::base_type(start, "port_range")
    {
        init_port_rule(port);
        using namespace boost::spirit::qi;
        start = port > -(lit("-") > port);
    }

private:
    boost::spirit::qi::rule<It, signature> start;
    boost::spirit::qi::rule<It, std::uint16_t()> port;
};
I am a bit stuck on how to specify that, in a range, port1 must be less than port2. I think I have to use the eps parser here, but I cannot find the proper way to specify it. Any suggestions are very welcome.
You can indeed use semantic actions. You don't always need to attach them to an eps node, though. Here's what you'd get if you do:
port %= uint_parser<uint16_t, 10, 2, 5>() >> eps[ _pass = (_val>=10 && _val<=65535) ];
start = (port >> -('-' >> port)) >> eps(validate(_val));
Note that the first rule uses the simple form of eps with a semantic action attached. This requires operator%= so that automatic attribute propagation still takes place.
The second rule uses the semantic predicate form of eps. The validate function needs to be a Phoenix actor; I defined it like:
struct validations {
    bool operator()(PortRange const& range) const {
        if (range.end)
            return range.start<*range.end;
        return true;
    }
};
boost::phoenix::function<validations> validate;
Note you can use the second rule style on both rules like so:
port %= uint_parser<Port, 10, 2, 5>() >> eps(validate(_val));
start = (port >> -('-' >> port)) >> eps(validate(_val));
if you simply add an overload to validate a single port:
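struct validations {
    bool operator()(Port const& port) const {
        // The upper bound always holds for a 16-bit Port; it is kept for
        // clarity. uint_parser<Port, ...> already fails on overflow.
        return port>=10 && port<=65535;
    }
    bool operator()(PortRange const& range) const {
        if (range.end)
            return range.start<*range.end;
        return true;
    }
};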
Let's define some nice edge cases and test them!
#include <boost/fusion/adapted/struct.hpp>
#include <boost/optional/optional_io.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <cstdint>
#include <iostream>
#include <string>
namespace qi = boost::spirit::qi;

using Port = std::uint16_t;
struct PortRange {
    Port start;
    boost::optional<Port> end;
};
BOOST_FUSION_ADAPT_STRUCT(PortRange, start, end)

template <class It, typename Attr = PortRange>
struct port_range_grammar : qi::grammar<It, Attr()> {
    port_range_grammar() : port_range_grammar::base_type(start, "port_range") {
        using namespace qi;
        port %= uint_parser<Port, 10, 2, 5>() >> eps(validate(_val));
        start = (port >> -('-' >> port)) >> eps(validate(_val));
        port.name("valid port range: (10, 65535)");
    }

  private:
    struct validations {
        bool operator()(Port const& port) const {
            return port>=10 && port<=65535;
        }
        bool operator()(PortRange const& range) const {
            if (range.end)
                return range.start<*range.end;
            return true;
        }
    };
    boost::phoenix::function<validations> validate;
    qi::rule<It, Attr()> start;
    qi::rule<It, Port()> port;
};
int main() {
    using It = std::string::const_iterator;
    port_range_grammar<It> const g;

    std::string const valid[] = {"10", "6322", "6322-6325", "65535"};
    std::string const invalid[] = {"9", "09", "065535", "65536", "-1", "6325-6322"};

    std::cout << " -------- valid cases\n";
    for (std::string const input : valid) {
        It f=input.begin(), l = input.end();
        PortRange range;
        bool accepted = parse(f, l, g, range);
        if (accepted)
            std::cout << "Parsed '" << input << "' to " << boost::fusion::as_vector(range) << "\n";
        else
            std::cout << "TEST FAILED '" << input << "'\n";
    }

    std::cout << " -------- invalid cases\n";
    for (std::string const input : invalid) {
        It f=input.begin(), l = input.end();
        PortRange range;
        bool accepted = parse(f, l, g, range);
        if (accepted)
            std::cout << "TEST FAILED '" << input << "' (returned " << boost::fusion::as_vector(range) << ")\n";
    }
}
Prints:
-------- valid cases
Parsed '10' to (10 --)
Parsed '6322' to (6322 --)
Parsed '6322-6325' to (6322 6325)
Parsed '65535' to (65535 --)
-------- invalid cases
TEST FAILED '065535' (returned (6553 --))
CONGRATULATIONS We found a broken edge case
Turns out that by limiting uint_parser to 5 positions, we may leave characters in the input, so that 065535 parses as 6553 (leaving '5' unparsed...). Fixing that is simple:
start = (port >> -('-' >> port)) >> eoi >> eps(validate(_val));
Or indeed:
start %= (port >> -('-' >> port)) >> eoi[ _pass = validate(_val) ];
You will have noticed I revised your attribute type. Most of this is "good taste". Note, in practice you might want to represent your range as either single-port or range:
using Port = std::uint16_t;
struct PortRange {
    Port start, end;
};
using PortOrRange = boost::variant<Port, PortRange>;
Which you would then parse like:
port %= uint_parser<Port, 10, 2, 5>() >> eps(validate(_val));
range = (port >> '-' >> port) >> eps(validate(_val));
start = (range | port) >> eoi;
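To get a feel for what consuming such a variant involves, here is a rough sketch that normalizes a `PortOrRange` into a first/last pair. It uses `std::variant` and `std::visit` (C++17) as a stand-in for `boost::variant` and `boost::apply_visitor`, purely to keep the example dependency-free; the `bounds_visitor` and `bounds` names are hypothetical.

```cpp
#include <cstdint>
#include <utility>
#include <variant>

using Port = std::uint16_t;
struct PortRange {
    Port start, end;
};
// Assumption: std::variant stands in for the boost::variant used above.
using PortOrRange = std::variant<Port, PortRange>;

// Hypothetical visitor: every consumer has to handle both alternatives.
struct bounds_visitor {
    std::pair<Port, Port> operator()(Port p) const { return {p, p}; }
    std::pair<Port, Port> operator()(PortRange const& r) const { return {r.start, r.end}; }
};

// Returns the first and last port, whichever alternative is stored.
std::pair<Port, Port> bounds(PortOrRange const& v) {
    return std::visit(bounds_visitor{}, v);
}
```

Calling `bounds(PortOrRange{PortRange{6322, 6325}})` yields the pair (6322, 6325), and a single port yields the same value twice.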
You might think this will get unwieldy to use. I agree!
Let's do without variant or optional in the first place. Let's make a single port just a range which happens to have start==end:
using Port = std::uint16_t;
struct PortRange {
    Port start, end;
};
Parse it like:
start = port >> -('-' >> port | attr(0)) >> eoi >> eps(validate(_val));
All we do in validate is check whether end is 0 and, if so, set it to start:
bool operator()(PortRange& range) const {
    if (range.end == 0)
        range.end = range.start;
    return range.start <= range.end;
}
And now the output is:
-------- valid cases
Parsed '10' to (10-10)
Parsed '6322' to (6322-6322)
Parsed '6322-6325' to (6322-6325)
Parsed '65535' to (65535-65535)
-------- invalid cases
Note how you can now always enumerate start..end without knowing whether there was a port or a port-range. This may be convenient (depending a bit on the logic you're implementing).
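As a minimal sketch of that enumeration (the `ports_in` helper is hypothetical and not part of the grammar): the loop counter is deliberately wider than `Port`, because a `std::uint16_t` counter would wrap around to 0 after 65535 and the loop would never terminate for a range ending at the maximum port.

```cpp
#include <cstdint>
#include <vector>

using Port = std::uint16_t;
struct PortRange {
    Port start, end;
};

// Hypothetical helper: lists every port in [start, end], inclusive.
std::vector<Port> ports_in(PortRange const& range) {
    std::vector<Port> result;
    // Use a wider counter: with a 16-bit counter, `p <= range.end` is always
    // true when end == 65535, so the loop would never stop.
    for (std::uint32_t p = range.start; p <= range.end; ++p)
        result.push_back(static_cast<Port>(p));
    return result;
}
```

A single port such as `PortRange{6322, 6322}` yields exactly one element, so callers need no special case.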