I am writing a grammar that contains a rule for parsing email addresses. The rule is declared as:
qi::rule<Iterator, ascii::space_type, std::string()> email;
and its definition is:
email
=
qi::lexeme[
+ascii::alnum
>> *(qi::char_(".") >> +ascii::alnum)
>> qi::char_("@")
>> +ascii::alnum
>> +(qi::char_(".") >> +ascii::alnum)
];
When I parse a text using this grammar, the parser correctly matches the email address, but the rule's synthesized attribute does not contain the full address. For example, if the text contains the address info.it@example.com, the synthesized attribute is info.@example. I think this is due to the Kleene star and plus operators.
I am using Boost 1.48. I have tested the code with Boost 1.54, where it works properly, but unfortunately I cannot upgrade my project to that version.
How can I work around this problem, maybe using semantic actions?
Interesting.
I suppose it has to do with a change in how container attributes get appended to by subsequent container-handling parser expressions.
I'm not going to install that library version, but here are a few things you can do:
NOTE: Your pattern does not cover general email addresses; those are much more complicated in reality. I'm assuming your rule is right for your internal requirements.
Your rule doesn't allow .. anywhere, right? I'm assuming this is on purpose too.
Your rule doesn't allow . at the start or end of a substring either. I'm assuming this is on purpose too.
Drop the skipper, since the whole rule is a lexeme (see Boost spirit skipper issues):
qi::rule<Iterator, std::string()> email;
email =
+ascii::alnum
>> *(qi::char_(".") >> +ascii::alnum)
>> qi::char_("@")
>> +ascii::alnum
>> +(qi::char_(".") >> +ascii::alnum)
;
Now, use either raw[] or as_string[] to gather the whole input:
qi::rule<Iterator, std::string()> email;
email = qi::as_string [
+ascii::alnum
>> *(qi::char_(".") >> +ascii::alnum)
>> qi::char_("@")
>> +ascii::alnum
>> +(qi::char_(".") >> +ascii::alnum)
];
With raw[] you don't even need attribute capturing (the char_ parsers can become plain literals), which makes the rule both simpler and more efficient:
qi::rule<Iterator, std::string()> email;
email = qi::raw [
+ascii::alnum >> *('.' >> +ascii::alnum)
>> '@'
>> +ascii::alnum >> +('.' >> +ascii::alnum)
];