
Questions about Spirit.Qi sequence operator and semantic actions


I have some questions about the sequence operator and semantic actions in Spirit Qi.

I'm trying to define a grammar rule for a floating point number that accepts metric prefixes (u, m, k, M, etc.) as well as the normal exponent form.

  rule<Iterator, std::string()> sign = char_("+-") [ _val = _1 ];
  rule<Iterator, std::string()> exp = char_("eE") >> -sign >> +digit;
  rule<Iterator, std::string()> suffix = char_("yzafpnumkKMGTPEZY") [ _val = _1 ];
  rule<Iterator, std::string()> mantissa = ((*digit >> char_('.') >> +digit) | (+digit >> char_('.') >> *digit));
  rule<Iterator, std::string()> unsigned_floating = (mantissa >> -(exp | suffix) | +digit >> (exp | suffix));
  rule<Iterator, std::string()> floating = -sign >> unsigned_floating;

Question 1: Why do I have to add a semantic action to the rule sign above? Isn't char convertible to std::string?

Question 2: Why does compilation fail when I try to merge the last two rules like this:

  rule<Iterator, std::string()> floating = -sign >> (mantissa >> -(exp | suffix) | +digit >> (exp | suffix));

Question 3: Let's say I want to let the attribute of floating be double and write a semantic action to do the conversion from string to double. How can I refer to the entire string matched by the rule from inside the semantic action?

Question 4: In the rule floating of Question 2, what does the placeholder _2 refer to and what is its type?

Edit:

I guess the last question needs some clarification:

What does the placeholder _2 refer to in the semantic action of the following rule, and what's its type?

  rule<Iterator, std::string()> floating = (-sign >> (mantissa >> -(exp | suffix) | +digit >> (exp | suffix))) [ _2 ];

Thanks!


Solution

  • First, a blow-by-blow. See below for an out-of-the-box answer.

    Question 1: Why do I have to add a semantic action to the rule sign above? Isn't char convertible to std::string?

    No, char is not implicitly convertible to std::string. See below for other options.
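To make the point about char and std::string concrete, here is a minimal sketch in standard C++11 (no Spirit involved). The helper name to_string_from_char is made up for illustration. It shows that the implicit conversion does not exist, and that an explicit construction is needed instead.

```cpp
#include <string>
#include <type_traits>

// There is no implicit conversion from char to std::string.
static_assert(!std::is_convertible<char, std::string>::value,
              "char does not implicitly convert to std::string");

// Hypothetical helper: builds a one-character string explicitly,
// using the (count, char) constructor of std::string.
std::string to_string_from_char(char c) {
    return std::string(1, c);
}
```

This is roughly what the semantic action in the sign rule has to accomplish by hand when it assigns the matched character to the rule's string attribute.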

    Question 2: Why does compilation fail when I try to merge the last two rules like this:

    rule<Iterator, std::string()> floating = -sign >> 
                  (mantissa >> -(exp | suffix) | +digit >> (exp | suffix));
    

    This is due to the rules for atomic attribute assignment. The parser exposes something like

    vector2<optional<string>, variant<
          vector2<string, optional<string> >,
          vector2<std::vector<char>, string> > >
    

    or similar (see the documentation for the parsers; I typed this in the browser from memory). This is, obviously, not assignable to string. Use qi::as<> to coerce atomic assignment. For convenience, there is qi::as_string:

    floating = qi::as_string [ -sign >> (mantissa >> -(exp | suffix) | 
                                         +digit >> (exp | suffix)) ];
    

    Question 3: Let's say I want to let the attribute of floating be double and write a semantic action to do the conversion from string to double. How can I refer to the entire string matched by the rule from inside the semantic action?

    You could use qi::as_string again, but the most appropriate choice seems to be qi::raw:

    floating = qi::raw [ -sign >> (mantissa >> -(exp | suffix) | 
                                   +digit >> (exp | suffix)) ] 
           [ _val = parse_float(_1) ];  // _1 is a boost::iterator_range<Iterator>
    

    This parser directive exposes a boost::iterator_range, i.e. a pair of source iterators, as its single attribute (_1), so you can use it to refer to the exact input sequence matched.

    Question 4: In the rule floating of Question 2, what does the placeholder _2 refer to and what is its type?

    In general, to detect attribute types - that is, when the documentation has you confused or you want to double check your understanding of it - see the answers here:


    Out-of-the-box

    Have you looked at using Qi's built-in real_parser<> template, which can be comprehensively customized? It sure looks like you'd want to use that instead of doing custom parsing in your semantic action.

    The real_parser template with policies is fast, flexible, and robust. See also the recent answer "Is it possible to read infinity or NaN values using input streams?".

    For models of RealPolicies the following expressions must be valid:

    Expression                 | Semantics 
    ===========================+=============================================================================
    RP::allow_leading_dot      | Allow leading dot. 
    RP::allow_trailing_dot     | Allow trailing dot. 
    RP::expect_dot             | Require a dot. 
    RP::parse_sign(f, l)       | Parse the prefix sign (e.g. '-'). Return true if successful, otherwise false. 
    RP::parse_n(f, l, n)       | Parse the integer at the left of the decimal point. Return true if successful, otherwise false. If successful, place the result into n. 
    RP::parse_dot(f, l)        | Parse the decimal point. Return true if successful, otherwise false. 
    RP::parse_frac_n(f, l, n)  | Parse the fraction after the decimal point. Return true if successful, otherwise false. If successful, place the result into n. 
    RP::parse_exp(f, l)        | Parse the exponent prefix (e.g. 'e'). Return true if successful, otherwise false. 
    RP::parse_exp_n(f, l, n)   | Parse the actual exponent. Return true if successful, otherwise false. If successful, place the result into n. 
    RP::parse_nan(f, l, n)     | Parse a NaN. Return true if successful, otherwise false. If successful, place the result into n. 
    RP::parse_inf(f, l, n)     | Parse an Inf. Return true if successful, otherwise false. If successful, place the result into n.
    

    See the example for a compelling idea of how you'd use it.