I'm generally familiar with using qi::attr to implement a "default value" for a missing entry in parsed input. But I haven't seen how to do this when the default value needs to be pulled from an earlier parse.
I'm trying to parse into the following struct:
struct record_struct {
std::string Name;
uint8_t Distance;
uint8_t TravelDistance;
std::string Comment;
};
The input uses a relatively simple "(text) (number) [(number)] [//comment]" format, where both the second number and the comment are optional. If the second number is not present, its value should be set to the same as the first number.
What follows is a cut-down example of working code that doesn't QUITE do what I want. This version just defaults to 0 rather than the correct value. If possible, I'd like to isolate the parsing of the two integers to a separate parser rule, without giving up using the fusion struct.
Things I've tried that haven't compiled: replacing qi::attr(0) with qi::attr(qi::_2).
The full test code:
#include <iostream>
#include <string>
#include <cstdint>
#include <boost/spirit/include/qi.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
struct record_struct {
std::string Name;
uint8_t Distance;
uint8_t TravelDistance;
std::string Comment;
};
BOOST_FUSION_ADAPT_STRUCT(
record_struct,
(std::string, Name)
(uint8_t, Distance)
(uint8_t, TravelDistance)
(std::string, Comment)
)
std::ostream &operator<<(std::ostream &o, const record_struct &s) {
o << s.Name << " (" << +s.Distance << ":" << +s.TravelDistance << ") " << s.Comment;
return o;
}
bool test(std::string s) {
std::string::const_iterator iter = s.begin();
std::string::const_iterator end = s.end();
record_struct result;
namespace qi = boost::spirit::qi;
bool parsed = boost::spirit::qi::parse(iter, end, (
+(qi::alnum | '_') >> qi::omit[+qi::space]
>> qi::uint_ >> ((qi::omit[+qi::space] >> qi::uint_) | qi::attr(0))
>> ((qi::omit[+qi::space] >> "//" >> +qi::char_) | qi::attr(""))
), result);
if (parsed) std::cout << "Parsed: " << result << "\n";
else std::cout << "Failed: " << std::string(iter, end) << "\n";
return parsed;
}
int main(int argc, char **argv) {
if (!test("Milan 20 22")) return 1;
if (!test("Paris 8 9 // comment")) return 1;
if (!test("London 5")) return 1;
if (!test("Rome 1 //not a real comment")) return 1;
return 0;
}
Output:
Parsed: Milan (20:22)
Parsed: Paris (8:9) comment
Parsed: London (5:0)
Parsed: Rome (1:0) not a real comment
Output I want to see:
Parsed: Milan (20:22)
Parsed: Paris (8:9) comment
Parsed: London (5:5)
Parsed: Rome (1:1) not a real comment
First of all, instead of spelling out omit[+space], just use a skipper:
bool parsed = qi::phrase_parse(iter, end, (
qi::lexeme[+(alnum | '_')]
>> uint_ >> (uint_ | attr(0))
>> (("//" >> lexeme[+qi::char_]) | attr(""))
), qi::space, result);
Here, qi::space is the skipper. lexeme[] avoids skipping inside the sub-expression (see Boost spirit skipper issues).
Next up, there is more than one way to do this.
You could use a local attribute to temporarily store a value:
rule<It, record_struct(), locals<uint8_t>, space_type> g;
g %= lexeme[+(alnum | '_')]
>> uint_ [_a = _1] >> (uint_ | attr(_a))
>> -("//" >> lexeme[+char_]);
parsed = phrase_parse(iter, end, g, space, result);
This requires a qi::rule declaration that declares the qi::locals<uint8_t> (qi::_a is the placeholder for that local attribute), and %= so that semantic actions do not overrule attribute propagation.
There's a wacky hybrid here where you don't actually use locals<> but just refer to an external variable; this is in general a bad idea, but as your parser is not recursive/reentrant you could do it:
uint8_t dist_ = 0; // external variable the parser writes to and reads back
parsed = phrase_parse(iter, end, (
lexeme[+(alnum | '_')]
>> uint_ [ phx::ref(dist_) = _1 ] >> (uint_ | attr(phx::ref(dist_)))
>> (("//" >> lexeme[+char_]) | attr(""))
), space, result);
You could go full Boost Phoenix and read the already-parsed value back through _val (which is only available inside a rule, so the expression goes into one):
rule<It, record_struct(), space_type> g;
g = lexeme[+(alnum | '_')]
>> uint_ >> (uint_ | attr(phx::at_c<1>(_val)))
>> (("//" >> lexeme[+char_]) | attr(""));
parsed = phrase_parse(iter, end, g, space, result);
You could parse into optional<uint8_t> and postprocess the information:
std::string name;
uint8_t distance;
boost::optional<uint8_t> travelDistance;
std::string comment;
parsed = phrase_parse(iter, end, (
lexeme[+(alnum | '_')]
>> uint_ >> -uint_
>> -("//" >> lexeme[+char_])
), space, name, distance, travelDistance, comment);
result = { name, distance, travelDistance? *travelDistance : distance, comment };
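To see the fallback step of that last option in isolation, here is a minimal sketch that does the same job without Spirit at all, using std::optional from C++17 in place of boost::optional. The helper names (skip_space, read_u8, parse_record) are made up for illustration and are not part of Boost, and the hand-written scanning only approximates what the Qi grammar does.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Mirrors the struct from the question.
struct record_struct {
    std::string Name;
    std::uint8_t Distance;
    std::uint8_t TravelDistance;
    std::string Comment;
};

// Hypothetical helper: advance pos past any whitespace (plays the skipper's role).
void skip_space(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) ++pos;
}

// Hypothetical helper: read a decimal number that fits in a uint8_t.
// On failure, pos is left untouched and std::nullopt is returned.
std::optional<std::uint8_t> read_u8(const std::string& s, std::size_t& pos) {
    std::size_t p = pos;
    skip_space(s, p);
    unsigned value = 0;
    std::size_t digits = 0;
    while (p < s.size() && std::isdigit(static_cast<unsigned char>(s[p]))) {
        value = value * 10 + static_cast<unsigned>(s[p] - '0');
        if (value > 255u) return std::nullopt;  // does not fit in a uint8_t
        ++p;
        ++digits;
    }
    if (digits == 0) return std::nullopt;
    pos = p;
    return static_cast<std::uint8_t>(value);
}

// Hypothetical helper: parse "(text) (number) [(number)] [//comment]".
std::optional<record_struct> parse_record(const std::string& s) {
    std::size_t pos = 0;
    skip_space(s, pos);
    const std::size_t name_start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_')) {
        ++pos;
    }
    if (pos == name_start) return std::nullopt;
    const std::string name = s.substr(name_start, pos - name_start);

    const std::optional<std::uint8_t> distance = read_u8(s, pos);
    if (!distance) return std::nullopt;

    // The second number may be absent; keep it optional for now.
    const std::optional<std::uint8_t> travel = read_u8(s, pos);

    std::string comment;
    skip_space(s, pos);
    assert(pos <= s.size());  // precondition of compare() below
    if (s.compare(pos, 2, "//") == 0) {
        pos += 2;
        skip_space(s, pos);
        comment = s.substr(pos);
        pos = s.size();
    }
    if (pos != s.size()) return std::nullopt;  // unparsed trailing input

    // Post-processing: fall back to the first number when the second is missing.
    return record_struct{name, *distance, travel.value_or(*distance), comment};
}
```

Calling parse_record("London 5") gives a record whose TravelDistance equals its Distance, while parse_record("Milan 20 22") keeps the explicit 22. The value_or call is the same decision as the ternary in the Spirit version above.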
I noticed this a little late:
If possible, I'd like to isolate the parsing of the two integers to a separate parser rule, without giving up using the fusion struct.
Well, of course you can:
rule<It, uint8_t(uint8_t)> def_uint8 = uint_parser<uint8_t>() | attr(_r1);
This is at once more accurate, because it doesn't parse unsigned values that don't fit in a uint8_t. Mixing and matching from the above: Live On Coliru