c++, boost-spirit, boost-spirit-x3

Recursive rule in Spirit.X3


I want to parse a recursive grammar with Boost.Spirit X3, but compilation fails because the template instantiation depth is exceeded.

The grammar looks like this:

value: int | float | char | tuple
int: "int: " int_
float: "float: " real_ 
char: "char: " char_
tuple: "tuple: [" value* "]"

Here is a self-contained example:

#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <string>
#include <vector>
#include <variant>

struct value: std::variant<int,float,std::vector<value>>
{ 
    using std::variant<int,float,std::vector<value>>::variant;

    value& operator=(float) { return *this; } 
    value& operator=(int) { return *this; } 
    value& operator=(std::vector<value>) { return *this; } 
};

using namespace boost::fusion;
namespace x3 = boost::spirit::x3;

using x3::skip;
using x3::int_;
using x3::real_parser;
using x3::char_;

x3::rule<class value_, value> const value_ = "value";
x3::rule<class o_tuple_, std::vector<value>> o_tuple_ = "tuple";

using float_p = real_parser<float, x3::strict_real_policies<float>>;


const auto o_tuple__def = "tuple: " >> skip(boost::spirit::x3::space)["[" >> value_ % "," >> "]"];
BOOST_SPIRIT_DEFINE(o_tuple_)

const auto value__def
    = ("float: " >> float_p())
    | ("int: " >> int_)
    | o_tuple_
    ;

BOOST_SPIRIT_DEFINE(value_)

int main()
{
  std::string str;
  value val;

  using boost::spirit::x3::parse;
  auto first = str.cbegin(), last = str.cend();
  bool r = parse(first, last, value_, val);
}

This compiles if the line | o_tuple_ is commented out (i.e. when there is no recursion).


Solution

  • This is a common problem with recursive rules in X3. It is as yet unresolved.

    I think the issue arises because x3::skip alters the context object¹. Indeed, dropping it makes the code compile and successfully parse some trivial test cases:

    "float: 3.14",
    "int: 3.14",
    "tuple: [float: 3.14,int: 3]",
    

    However, the following obviously does not parse without the skipper:

    // the following _should_ have compiled with the original skip() configuration:
    "tuple: [ float: 3.14,\tint: 3 ]",
    

    Now, I venture that you can get rid of the problem by applying the skipper at the top level (which means that the context is identical for all rules involved in the instantiation "cycle"). If you do, you will at once start accepting more flexible whitespace in the input:

    // the following would not have parsed with the original skip() configuration:
    "float:3.14",
    "int:3.14",
    "tuple:[float: 3.14,int: 3]",
    "tuple:[float:3.14,int:3]",
    "tuple: [ float:3.14,\tint:3 ]",
    

    None of these would have parsed with the original approach, even if it had compiled successfully.

    What it takes

    Here are some of the tweaks I made to the code.

    1. Remove the no-op assignment operators value::operator= (I don't know why you had them).

    2. Add code to print a debug dump of any value:

      friend std::ostream& operator<<(std::ostream& os, base_type const& v) {
          struct {
              std::ostream& operator()(float const& f) const { return _os << "float:" << f; }
              std::ostream& operator()(int const& i)   const { return _os << "int:" << i; }
              std::ostream& operator()(std::vector<value> const& v) const { 
                  _os << "tuple: [";
                  for (auto& el : v) _os << el << ",";
                  return _os << ']';
              }
              std::ostream& _os;
          } vis { os };
      
          return std::visit(vis, v);
      }
      
    3. Drop the skipper and split the keywords from the ':' punctuation:

      namespace x3 = boost::spirit::x3;
      
      x3::rule<struct value_class, value> const value_ = "value";
      x3::rule<struct o_tuple_class, std::vector<value> > o_tuple_ = "tuple";
      
      x3::real_parser<float, x3::strict_real_policies<float> > float_;
      
      const auto o_tuple__def = "tuple" >> x3::lit(':') >> ("[" >> value_ % "," >> "]");
      
      const auto value__def
          = "float" >> (':' >> float_)
          | "int" >> (':' >> x3::int_)
          | o_tuple_
          ;
      
      BOOST_SPIRIT_DEFINE(value_, o_tuple_)
      
    4. Now, the crucial step: add the skipper at the top level:

      const auto entry_point = x3::skip(x3::space) [ value_ ];
      
    5. Create a nice test driver main():

      int main()
      {
          for (std::string const str : {
                  "",
                  "float: 3.14",
                  "int: 3.14",
                  "tuple: [float: 3.14,int: 3]",
                  // the following _should_ have compiled with the original skip() configuration:
                  "tuple: [ float: 3.14,\tint: 3 ]",
                  // the following would not have parsed with the original skip() configuration:
                  "float:3.14",
                  "int:3.14",
                  "tuple:[float: 3.14,int: 3]",
                  "tuple:[float:3.14,int:3]",
                  "tuple: [ float:3.14,\tint:3 ]",
                  // one final show case for good measure
                  R"(
                  tuple: [
                     int  : 4,
                     float: 7e9,
                     tuple: [float: -inf],
      
      
                     int: 42
                  ])"
          }) {
              std::cout << "============ '" << str << "'\n";
      
              //using boost::spirit::x3::parse;
              auto first = str.begin(), last = str.end();
              value val;
      
              if (parse(first, last, parser::entry_point, val))
                  std::cout << "Parsed '" << val << "'\n";
              else
                  std::cout << "Parse failed\n";
      
              if (first != last)
                  std::cout << "Remaining input: '" << std::string(first, last) << "'\n";
          }
      }
      

    Live Demo

    //#define BOOST_SPIRIT_X3_DEBUG
    #include <iostream>
    #include <boost/fusion/adapted.hpp>
    #include <boost/spirit/home/x3.hpp>
    #include <string>
    #include <vector>
    #include <variant>
    
    struct value: std::variant<int,float,std::vector<value>>
    { 
        using base_type = std::variant<int,float,std::vector<value>>;
        using base_type::variant;
    
        friend std::ostream& operator<<(std::ostream& os, base_type const& v) {
            struct {
                std::ostream& operator()(float const& f) const { return _os << "float:" << f; }
                std::ostream& operator()(int const& i)   const { return _os << "int:" << i; }
                std::ostream& operator()(std::vector<value> const& v) const { 
                    _os << "tuple: [";
                    for (auto& el : v) _os << el << ",";
                    return _os << ']';
                }
                std::ostream& _os;
            } vis { os };
    
            return std::visit(vis, v);
        }
    };
    
    namespace parser {
        namespace x3 = boost::spirit::x3;
    
        x3::rule<struct value_class, value> const value_ = "value";
        x3::rule<struct o_tuple_class, std::vector<value> > o_tuple_ = "tuple";
    
        x3::real_parser<float, x3::strict_real_policies<float> > float_;
    
        const auto o_tuple__def = "tuple" >> x3::lit(':') >> ("[" >> value_ % "," >> "]");
    
        const auto value__def
            = "float" >> (':' >> float_)
            | "int" >> (':' >> x3::int_)
            | o_tuple_
            ;
    
        BOOST_SPIRIT_DEFINE(value_, o_tuple_)
    
        const auto entry_point = x3::skip(x3::space) [ value_ ];
    }
    
    int main()
    {
        for (std::string const str : {
                "",
                "float: 3.14",
                "int: 3.14",
                "tuple: [float: 3.14,int: 3]",
                // the following _should_ have compiled with the original skip() configuration:
                "tuple: [ float: 3.14,\tint: 3 ]",
                // the following would not have parsed with the original skip() configuration:
                "float:3.14",
                "int:3.14",
                "tuple:[float: 3.14,int: 3]",
                "tuple:[float:3.14,int:3]",
                "tuple: [ float:3.14,\tint:3 ]",
                // one final show case for good measure
                R"(
                tuple: [
                   int  : 4,
                   float: 7e9,
                   tuple: [float: -inf],
    
    
                   int: 42
                ])"
        }) {
            std::cout << "============ '" << str << "'\n";
    
            //using boost::spirit::x3::parse;
            auto first = str.begin(), last = str.end();
            value val;
    
            if (parse(first, last, parser::entry_point, val))
                std::cout << "Parsed '" << val << "'\n";
            else
                std::cout << "Parse failed\n";
    
            if (first != last)
                std::cout << "Remaining input: '" << std::string(first, last) << "'\n";
        }
    }
    

    Prints

    ============ ''
    Parse failed
    ============ 'float: 3.14'
    Parsed 'float:3.14'
    ============ 'int: 3.14'
    Parsed 'int:3'
    Remaining input: '.14'
    ============ 'tuple: [float: 3.14,int: 3]'
    Parsed 'tuple: [float:3.14,int:3,]'
    ============ 'tuple: [ float: 3.14, int: 3 ]'
    Parsed 'tuple: [float:3.14,int:3,]'
    ============ 'float:3.14'
    Parsed 'float:3.14'
    ============ 'int:3.14'
    Parsed 'int:3'
    Remaining input: '.14'
    ============ 'tuple:[float: 3.14,int: 3]'
    Parsed 'tuple: [float:3.14,int:3,]'
    ============ 'tuple:[float:3.14,int:3]'
    Parsed 'tuple: [float:3.14,int:3,]'
    ============ 'tuple: [ float:3.14,  int:3 ]'
    Parsed 'tuple: [float:3.14,int:3,]'
    ============ '
                tuple: [
                   int  : 4,
                   float: 7e9,
                   tuple: [float: -inf],
    
    
                   int: 42
                ]'
    Parsed 'tuple: [int:4,float:7e+09,tuple: [float:-inf,],int:42,]'
    

    ¹ Other directives do too, such as x3::with<>. The problem appears to be that the context gets extended on each instantiation level, rather than being "modified" back into the original context type, which would end the instantiation cycle.
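
    To make the footnote more concrete, here is a toy model of the two behaviours. It is not X3's actual implementation: the names context, extend_t, modify_t and depth are invented for this sketch, and it only illustrates why a context type that grows on every pass can never close a recursive instantiation cycle.

```cpp
#include <cstddef>
#include <type_traits>

// Toy stand-ins (hypothetical; these are not the real X3 types).
struct empty_context {};
struct skipper_tag {};

template <class Tag, class Next>
struct context {};

// Strategy 1, "extend": always wrap the incoming context in a new layer.
template <class Ctx>
using extend_t = context<skipper_tag, Ctx>;

// Strategy 2, "modify": if the tag is already on top, reuse the same type.
template <class Ctx>
struct modify { using type = context<skipper_tag, Ctx>; };

template <class Next>
struct modify<context<skipper_tag, Next>> { using type = context<skipper_tag, Next>; };

template <class Ctx>
using modify_t = typename modify<Ctx>::type;

// Counts how many layers a context type has.
template <class Ctx>
struct depth : std::integral_constant<std::size_t, 0> {};

template <class Tag, class Next>
struct depth<context<Tag, Next>>
    : std::integral_constant<std::size_t, 1 + depth<Next>::value> {};

// Two passes through a skip-like directive under each strategy.
using extended_once  = extend_t<empty_context>;
using extended_twice = extend_t<extended_once>;
using modified_once  = modify_t<empty_context>;
using modified_twice = modify_t<modified_once>;

// Extending yields a new type on every pass, so a recursive rule would
// need a fresh instantiation each time around the cycle.
static_assert(!std::is_same<extended_once, extended_twice>::value,
              "extend produces a different type per level");

// Modifying reaches a fixed point: the second pass sees the same type,
// so an already-instantiated rule can be reused.
static_assert(std::is_same<modified_once, modified_twice>::value,
              "modify gets the same type back");
```

    In this model a rule that re-enters itself through extend_t sees a deeper context type on every pass, while with modify_t the depth stays at one layer no matter how often the directive is applied.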