c++ parsing boost boost-spirit boost-spirit-qi

Constraining the existing Boost.Spirit real_parser (with a policy)


I want to parse a float but reject NaN values, so I define a policy that inherits from the default policy and instantiate a real_parser with it:

// using boost::spirit::qi::{real_parser,real_policies,
//                           phrase_parse,double_,char_};

template <typename T>
struct no_nan_policy : real_policies<T>
{
    template <typename I, typename A>
    static bool
    parse_nan(I&, I const&, A&) {
          return false;
    }    
};

real_parser<double, no_nan_policy<double> > no_nan;

// then I can use no_nan to parse, as in the following grammar
bool ok = phrase_parse(first, last, 
   no_nan[ref(valA) = _1] >> char_('@') >> double_[ref(b) = _1],
space);

But now I also want to ensure that the overall length of the string parsed with no_nan does not exceed 4 characters, i.e. "1.23", ".123", "2.e6" or even "inf" is fine, but "3.2323" is not, and neither is "nan". I cannot do that in the parse_n/parse_frac_n hooks of the policy: they look at the digits left and right of the dot separately and cannot communicate (at least not cleanly), which they would have to, since it is the overall length that matters.

The idea then was to extend real_parser (in boost/spirit/home/qi/numeric/real.hpp) and wrap the parse method -- but this class has no methods. Next to real_parser is the any_real_parser struct which does have parse, but these two structs do not seem to interact in any obvious way.

Is there a way to easily inject my own parse(), do some pre-checks, and then call the real parse (return boost::spirit::qi::any_real_parser<T, RealPolicy>::parse(...)) which then adheres to the given policies? Writing a new parser would be a last-resort method, but I hope there is a better way.

(Using Boost 1.55, i.e. Spirit 2.5.2, with C++11)


Solution

  • It seems I am so close, i.e. just a few changes to the double_ parser and I'd be done. This would probably be a lot more maintainable than adding a new grammar, since all the other parsing is done that way. – toting 7 hours ago

    Even more maintainable would be to not write another parser at all.

    You basically want to parse a floating-point number (Spirit has got you covered) but apply some validation afterward. I'd do the validation in a semantic action:

    raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && px::size(_1)<=4 ]
    

    That's it.

    Explanations

    Anatomy:

    • double_ [_val = _1] parses a double and assigns it to the exposed attribute as usual¹
    • raw [ parser ] matches the enclosed parser but exposes the raw source iterator range as an attribute
    • [ _pass = !isnan_(_val) && px::size(_1)<=4 ] - the business part!

      This semantic action attaches to the raw[] parser. Hence

      • _1 now refers to the raw iterator range that double_ matched
      • _val already contains the "cooked" value of a successful match of double_
      • _pass is a Spirit context flag that we can set to false to make parsing fail.
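
The check performed by that semantic action can also be sketched without Spirit. The following is only an illustration under stated assumptions: `parse_short_number` is a hypothetical helper, and it relies on `std::strtod`, whose accepted grammar differs from `double_` (for example, it skips leading whitespace and accepts hexadecimal floats). It mirrors the validation logic, not Spirit's parsing.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper, not part of Boost.Spirit: parse a leading number,
// then validate it the way the semantic action does.
// Returns true and stores the value in `out` only if a number was matched,
// it is not NaN, and the matched text is at most `max_len` characters long.
bool parse_short_number(std::string const& s, double& out, std::size_t max_len = 4)
{
    char const* begin = s.c_str();
    char* end = nullptr;

    // The "cooked" value, playing the role of _val.
    double const value = std::strtod(begin, &end);

    // Length of the matched text, playing the role of size(_1) on the raw range.
    std::size_t const matched = static_cast<std::size_t>(end - begin);

    if (matched == 0)        // nothing was parsed at all
        return false;
    if (std::isnan(value))   // corresponds to !isnan_(_val)
        return false;
    if (matched > max_len)   // corresponds to size(_1) <= 4
        return false;

    out = value;
    return true;
}
```

Called as `parse_short_number("1.23", v)`, the helper accepts the input, while "3.2323" fails the length check and "nan" fails the NaN check. The point of the design is the same as in the Spirit version: let the existing number parser do its job, then veto the match afterward.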

    Now the only thing left is to tie it all together. Let's make a deferred version of ::isnan:

    boost::phoenix::function<decltype(&::isnan)> isnan_(&::isnan);
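
Taking the address of `::isnan` relies on the C library exposing a single global function of that name, which the C++ standard does not guarantee (it only specifies the overloaded `std::isnan`). If that line fails to compile on your toolchain, one possible alternative is a small function object. This is a sketch, not the answer's original method: `is_nan_impl` is a hypothetical name, and the Phoenix wrapping is shown only as a comment so the block stands alone without Boost.

```cpp
#include <cmath>

// Hypothetical function object wrapping std::isnan.
// The nested result_type lets boost::result_of (which Phoenix uses)
// determine the return type without decltype.
struct is_nan_impl
{
    typedef bool result_type;

    bool operator()(double v) const { return std::isnan(v); }
};

// Assumed usage, replacing the function-pointer version:
//   boost::phoenix::function<is_nan_impl> isnan_;
//   raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && size(_1) <= 4 ]
```

Because the functor names `std::isnan` inside its own body, overload resolution happens there, and no address of an overloaded function is ever taken.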
    

    We're good to go.

    Test Program


    #include <boost/spirit/include/qi.hpp>
    #include <boost/spirit/include/phoenix.hpp>
    #include <cmath>
    #include <iostream>
    #include <string>
    
    int main ()
    {
        using It = std::string::const_iterator;
    
        auto my_fpnumber = [] { // TODO encapsulate in a grammar struct
            using namespace boost::spirit::qi;
            using boost::phoenix::size;
    
            static boost::phoenix::function<decltype(&::isnan)> isnan_(&::isnan);
    
            return rule<It, double()> (
                    raw [ double_ [_val = _1] ] [ _pass = !isnan_(_val) && size(_1)<=4 ]
                );
        }();
    
        for (std::string const s: { "1.23", ".123", "2.e6", "inf", "3.2323", "nan" })
        {
            It f = s.begin(), l = s.end();
    
            double result;
            if (parse(f, l, my_fpnumber, result))
                std::cout << "Parse success:  '" << s << "' -> " << result << "\n";
            else
                std::cout << "Parse rejected: '" << s << "' at '" << std::string(f,l) << "'\n";
        }
    }
    

    Prints

    Parse success:  '1.23' -> 1.23
    Parse success:  '.123' -> 0.123
    Parse success:  '2.e6' -> 2e+06
    Parse success:  'inf' -> inf
    Parse rejected: '3.2323' at '3.2323'
    Parse rejected: 'nan' at 'nan'
    

    ¹ The assignment has to be done explicitly here because we use semantic actions, and they normally suppress automatic attribute propagation.