I'm just getting started with Boost::Spirit and have problems understanding what is going on in the following code:
#include <cstdio>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace ph = boost::phoenix;
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;
template <typename Iterator>
struct TestGrammar : qi::grammar<Iterator, std::string(), ascii::space_type>
{
    qi::rule<Iterator, std::string(), ascii::space_type> expr;
    qi::rule<Iterator, std::string(), ascii::space_type> tag;

    std::string convertTag(std::string& tag)
    {
        printf("Tag: %s\n", tag.c_str());
        tag = "_tag_" + tag;
        return tag;
    }

    TestGrammar()
        : TestGrammar::base_type(expr)
    {
        using qi::_1;
        using qi::as_string;
        using ascii::char_;
        using namespace qi::labels;

        // (1)
        tag %= as_string[+char_] [ ph::bind(&TestGrammar::convertTag, this, _1) ];
        // (2)
        //tag = as_string[+char_] [ _val = ph::bind(&TestGrammar::convertTag, this, _1) ];
        // (3)
        //tag = as_string[+char_] [ _val += ph::bind(&TestGrammar::convertTag, this, _1) ];

        expr = char_('!') >> tag;
    }
};
int main(int argc, char** argv)
{
    using ascii::space;

    std::string str("!abc");
    std::string::const_iterator beg = str.begin();
    std::string::const_iterator end = str.end();

    TestGrammar<std::string::const_iterator> expr;
    std::string res;

    bool r = phrase_parse(beg, end, expr, space, res);
    if (r && beg == end) {
        printf("Matched: %s\n", res.c_str());
    } else {
        printf("Didn't match!\n");
    }
    return 0;
}
This example is supposed to parse tags (identifiers) with a leading '!' and spit them out in the same format, but with "_tag_" prepended to the tag (so "!abc" becomes "!_tag_abc"). It's just a minimal example to show my problem.
What I don't understand is what happens when I run this code using the auto-rule in (1). Instead of the expected output, I get "_tag_!abc", and indeed the printf() in convertTag() actually prints "!abc" for the tag. But why is that? I am passing _1 into convertTag(), which I thought was supposed to be the attribute parsed by as_string[+char_], so how come it includes the '!' parsed in a completely different rule?
When I use rule (2) instead (which I believed would be equivalent to (1)), I instead get "_tag_abc", which seems to have dropped the initial '!', but why?
Rule (3) does what I want, although I have no idea why.
From (2) it seems to me that overwriting _val in the tag rule actually overwrites the entire synthesized attribute of not only tag, but also of expr. Doesn't setting _val in tag only influence the synthesized attribute of tag? And why the hell is there a '!' in my _1 in (1)?
// EDIT:
Whoops. I just realized that (2) and (3) probably make absolutely no sense because they assign the return value of ph::bind() (not of convertTag() itself) to _val, which probably does not do what I want (or does it?). Still, the question remains why (1) isn't working the way I want it to.
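Attributes are bound by reference. Since expr exposes only one attribute, that same attribute must be bound to both char_('!') and tag. This is indeed what happens, and it explains all the issues. In (2), for instance, assigning to _val overwrites that shared string, including the '!' already stored in it.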
Spirit is OK with this because the automatic attribute transformation and compatibility rules allow a sequence of parsers that each expose T (or a container of T) to propagate into a single container-of-T attribute. This is what lets you parse, e.g., qi::alpha >> +(qi::alnum | qi::char_('_')) into a single std::string.
So, when you take the attribute in the semantic action, you actually get the bound reference, which refers directly to std::string res in main. Adding
std::cout << "Address: " << &tag << "\n";
and
std::cout << "Address: " << &res << "\n";
shows they are the same:
Address: 0x7fffd54e5d00
Tag: !abc
Address: 0x7fffd54e5d00
Matched: _tag_!abc
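The effect can be mimicked in plain C++ without Spirit. This is only an analogy under simplifying assumptions, not how Spirit is implemented, and all function names below are hypothetical stand-ins: parseBang plays the role of char_('!'), parseTagSharedAttr plays rule (1), and parseTagOwnAttr plays rule (3).

```cpp
#include <string>

// Stand-in for char_('!'): appends the matched character to the bound attribute.
void parseBang(std::string& attr)
{
    attr += '!';
}

// Stand-in for rule (1): the sub-parser and the semantic action both work on
// the one shared string, which already holds '!'.
void parseTagSharedAttr(std::string& attr, const std::string& matched)
{
    attr += matched;        // automatic propagation appends to the shared string
    attr = "_tag_" + attr;  // the action sees "!abc" and rewrites all of it
}

// Stand-in for rule (3): the action gets a private temporary and only the
// converted result is appended to the shared string.
void parseTagOwnAttr(std::string& attr, const std::string& matched)
{
    std::string local = matched;  // temporary attribute, holds just "abc"
    attr += "_tag_" + local;
}

std::string runSharedAttribute()
{
    std::string res;
    parseBang(res);
    parseTagSharedAttr(res, "abc");
    return res;
}

std::string runOwnAttribute()
{
    std::string res;
    parseBang(res);
    parseTagOwnAttr(res, "abc");
    return res;
}
```

Here runSharedAttribute() returns "_tag_!abc", matching the output of rule (1), while runOwnAttribute() returns "!_tag_abc", matching rule (3).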
Rule (3) does what you want because the presence of a semantic action plus the absence of operator %= disables automatic attribute propagation. The upshot is that the semantic action gets a different (temporary) string and the behaviour is what you intuitively expected.
Regarding the "whoops", I'm not actually sure. I think phx::bind works differently from std::bind (or the result is once more down to "magic" compatibility rules). Anyhow, I tend to avoid any confusion by using boost::phoenix::function:
struct convertTag_f {
    std::string operator()(std::string const& tag) const {
        return "_tag_" + tag;
    }
};
boost::phoenix::function<convertTag_f> convertTag;

TestGrammar() : TestGrammar::base_type(expr)
{
    using namespace qi::labels;

    tag = qi::as_string[+ascii::char_] [ _val += convertTag(_1) ];
    expr = ascii::char_('!') >> tag;
}