Confusion about Boost::Spirit auto-rule behavior

I'm just getting started with Boost::Spirit and have problems understanding what is going on in the following code:

#include <cstdio>
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>

namespace ph = boost::phoenix;
namespace qi = boost::spirit::qi;
namespace ascii = boost::spirit::ascii;

template <typename Iterator>
struct TestGrammar : qi::grammar<Iterator, std::string(), ascii::space_type>
{
    qi::rule<Iterator, std::string(), ascii::space_type> expr;
    qi::rule<Iterator, std::string(), ascii::space_type> tag;

    std::string convertTag(std::string& tag)
    {
        printf("Tag: %s\n", tag.c_str());

        tag = "_tag_" + tag;
        return tag;
    }

    TestGrammar()
            : TestGrammar::base_type(expr)
    {
        using qi::_1;
        using qi::as_string;

        using ascii::char_;

        using namespace qi::labels;

        // (1)
        tag %= as_string[+char_] [ ph::bind(&TestGrammar::convertTag, this, _1) ];

        // (2)
        //tag = as_string[+char_] [ _val = ph::bind(&TestGrammar::convertTag, this, _1) ];

        // (3)
        //tag = as_string[+char_] [ _val += ph::bind(&TestGrammar::convertTag, this, _1) ];

        expr = char_('!') >> tag;
    }
};

int main(int argc, char** argv)
{
    using ascii::space;

    std::string str("!abc");

    std::string::const_iterator beg = str.begin();
    std::string::const_iterator end = str.end();

    TestGrammar<std::string::const_iterator> expr;

    std::string res;
    bool r = phrase_parse(beg, end, expr, space, res);

    if (r  &&  beg == end) {
        printf("Matched: %s\n", res.c_str());
    } else {
        printf("Didn't match!\n");
    }

    return 0;
}

This example is supposed to parse tags (identifiers) with a leading '!' and spit them out in the same format, but with "_tag_" prepended to the tag (so "!abc" becomes "!_tag_abc"). It's just a minimal example to show my problem.

What I don't understand is what happens when I run this code using the auto-rule in (1). Instead of the expected output, I get "_tag_!abc", and indeed the printf() in convertTag() actually prints "!abc" for the tag. But why is that? I am passing _1 into convertTag(), which I thought was supposed to be the attribute parsed by as_string[+char_], so how come it includes the '!' parsed in a completely different rule?

When I use rule (2) instead (which I believed would be equivalent to (1)), I instead get "_tag_abc", which seems to have dropped the initial '!', but why?

Rule (3) does what I want, although I have no idea why.

From (2) it seems to me that overwriting _val in the tag rule actually overwrites the entire synthesized attribute of not only tag, but also of expr. Doesn't setting _val in tag only influence the synthesized attribute of tag? And why the hell is there a '!' in my _1 in (1)?

// EDIT:

Whoops. I just realized that (2) and (3) probably make absolutely no sense, because they assign the return value of ph::bind() (not of convertTag() itself) to _val, which probably does not do what I want (or does it?). Still, the question remains why (1) isn't working the way I want it to.

Solution

Attributes are bound by reference. Since there's only one attribute exposed by expr it follows that the same attribute must be bound to both char_('!') and tag. This is true, and explains all the issues.

Spirit is OK with this because the automatic attribute transformation and compatibility rules allow a sequence of parsers that expose T (or containers of T) to propagate into a single container-of-T attribute. This is so you can e.g. parse qi::alpha >> +(qi::alnum | qi::char_('_')) straight into one std::string.

So, when you take the attribute in the semantic action, you actually get the bound reference itself, which refers directly to std::string res from main. Adding

    std::cout << "Address: " << &tag << "\n";

to convertTag() and

    std::cout << "Address: " << &res << "\n";

to main() (both need #include <iostream>) shows they are the same:

Address: 0x7fffd54e5d00
Tag: !abc
Address: 0x7fffd54e5d00
Matched: _tag_!abc

See it Live On Coliru

Other remarks:

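Rule (3) does what you want because a semantic action combined with plain operator= (rather than operator%=) disables automatic attribute propagation. The upshot is that as_string[+char_] fills a separate (temporary) string, so _1 is just "abc", while _val still refers to the bound attribute that already holds the '!'. Appending with += therefore yields "!_tag_abc", which is the behaviour you intuitively expected. The same reasoning covers rule (2): plain assignment to _val overwrites the '!' that is already there.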
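
The difference between rule (1) and rule (3) can be modelled in plain C++ without Spirit at all. This is only a rough sketch of the reference-binding idea, not how Spirit is implemented; match_bang, match_tag, model_expr and the fresh_attribute flag are all made-up names for illustration:

```cpp
#include <string>

// Stand-in for char_('!'): appends to whatever attribute it is bound to.
void match_bang(std::string& attr)
{
    attr += '!';
}

// Stand-in for the tag rule. `attr` plays the role of _val.
void match_tag(const std::string& text, std::string& attr, bool fresh_attribute)
{
    if (fresh_attribute) {
        // Like rule (3): the inner parser fills its own temporary (_1),
        // and the action appends the converted tag to _val.
        std::string local = text;
        attr += "_tag_" + local;
    } else {
        // Like rule (1): the inner parser writes into the shared attribute,
        // so the action sees (and rewrites) everything parsed so far.
        attr += text;
        attr = "_tag_" + attr;
    }
}

// Stand-in for expr = char_('!') >> tag, with one attribute bound to both parts.
std::string model_expr(const std::string& tag_text, bool fresh_attribute)
{
    std::string res;
    match_bang(res);
    match_tag(tag_text, res, fresh_attribute);
    return res;
}
```

With this model, model_expr("abc", false) reproduces the surprising "_tag_!abc" from rule (1), and model_expr("abc", true) gives the "!_tag_abc" of rule (3).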

Regarding the "whoops": I'm not actually sure. I think phx::bind works differently from std::bind (or the result is once more subject to the "magic" compatibility rules). Anyhow, I tend to avoid any confusion by using boost::phoenix::function:

struct convertTag_f {
    std::string operator()(std::string const& tag) const {
        return "_tag_" + tag;
    }
};
boost::phoenix::function<convertTag_f> convertTag;

TestGrammar() : TestGrammar::base_type(expr)
{
    using namespace qi::labels;

    tag  = qi::as_string[+ascii::char_] [ _val += convertTag(_1) ];
    expr = ascii::char_('!') >> tag;
}

See it Live On Coliru