c++ boost boost-spirit boost-spirit-qi

How to use boost spirit list operator with mandatory minimum amount of elements?


I would like to parse the DOT language (http://www.graphviz.org/content/dot-language), a graph definition language that defines nodes and the connections between them. A typical statement looks like node1->node2->node3;. It would be nice to use the boost::spirit list operator % to build the list of nodes. A naive approach would be:

edge_stmt %=
    (
        node_or_subgraph(_r1) % (qi::eps(_r1) >> tok.diredgeop | tok.undiredgeop)
    ) >> -attr_list;

_r1 indicates whether the graph is directed or undirected; diredgeop is the token for -> and undiredgeop is the token for --.

The problem is the above code will succeed for just node1;, which is incorrect. In order to get a correct parser I have to somehow declare that there must be at least two elements in the list built by %. How?

The documentation says that a % b is equivalent to a >> *(omit[b] >> a), which is incorrect. One might want to try this:

edge_stmt %=
    (
        node_or_subgraph(_r1) >>
            +(
                qi::omit
                [
                    qi::eps(_r1) >> tok.diredgeop | tok.undiredgeop
                ] >>
                node_or_subgraph(_r1)
            )
    ) >> -attr_list;

But this code doesn't produce a vector; its synthesized attribute is a tuple.

I can try semantic actions of course, but is there an elegant alternative without semantic actions?


Solution

  • The list operator cannot be told to require a minimum number of elements: unlike repeat, it has no such parameter, so getting that behaviour from % itself would mean writing a brand new parser. The following example shows how you can use a >> +(omit[b] >> a) instead. The key step is to assign the expression to a qi::rule whose declared attribute is the vector, so that it still exposes a single vector when other parsers follow it.

    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>
    #include <boost/spirit/include/qi.hpp>
    #include <boost/fusion/include/std_pair.hpp> // lets Spirit fill a std::pair attribute
    
    namespace qi= boost::spirit::qi;
    
    void print(const std::vector<std::string>& data) 
    {
        std::cout << "{ ";
        for(const auto& elem : data) {
            std::cout << elem << " ";
        }
        std::cout << "} ";
    }
    
    void print(const std::pair<std::string,double>& data) 
    {
        std::cout << "[ " << data.first << ", " << data.second << " ]";
    }
    
    
    template <typename Parser,typename... Attrs>
    void parse(const std::string& str, const Parser& parser, Attrs&... attrs)
    {
        std::string::const_iterator iter=std::begin(str), end=std::end(str);
        bool result = qi::phrase_parse(iter,end,parser,qi::space,attrs...);
        if(result && iter==end) {
            std::cout << "Success.";
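            // Pack-expansion idiom: call print() once per attribute, in order
            int ignore[] = {(print(attrs),0)...};
            (void)ignore; // silence the unused-variable warning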
            std::cout << "\n";
        } else {
            std::cout << "Something failed. Unparsed: \"" << std::string(iter,end) << "\"\n";
        }
    }
    
    template <typename Parser>
    void parse_with_nodes(const std::string& str, const Parser& parser) 
    {
        std::vector<std::string> nodes;
        parse(str,parser,nodes);
    }
    
    template <typename Parser>
    void parse_with_nodes_and_attr(const std::string& str, const Parser& parser) 
    {
        std::vector<std::string> nodes;
        std::pair<std::string,double> attr_pair;
        parse(str,parser,nodes,attr_pair);
    }
    
    int main()
    {
        qi::rule<std::string::const_iterator,std::string()> node=+qi::alnum;
        qi::rule<std::string::const_iterator,std::pair<std::string,double>(),qi::space_type> attr = +qi::alpha >> '=' >> qi::double_;
    
    
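        // The list operator accepts one or more nodes, so "node1" alone would match too
        parse_with_nodes("node1->node2", node % "->");
    
        parse_with_nodes_and_attr("node1->node2 arrowsize=1.0", node % "->" >> attr);
    
        // Requiring at least two nodes works as long as a single attribute is passed
        parse_with_nodes("node1->node2", node >> +("->" >> node));
    
        // Left commented out: here the sequence is flattened into three elements,
        // which no longer line up with the two attributes passed
        //parse_with_nodes_and_attr("node1->node2 arrowsize=1.0", node >> +("->" >> node) >> attr); 
    
        // A rule with a declared vector attribute makes the chain a single element again
        qi::rule<std::string::const_iterator,std::vector<std::string>(),qi::space_type> at_least_two_nodes = node >> +("->" >> node);
        parse_with_nodes_and_attr("node1->node2 arrowsize=1.0", at_least_two_nodes >> attr);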
    }
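
    As a quick check that the rule-wrapping approach rejects a lone node, here is a minimal sketch. It assumes the same alphanumeric node rule and literal "->" separator as the example above; parse_edge_chain is a hypothetical helper name. Because node is declared without a skipper, the inputs keep each node name directly after the arrow.

    ```cpp
    #include <iostream>
    #include <string>
    #include <vector>
    #include <boost/spirit/include/qi.hpp>

    namespace qi = boost::spirit::qi;
    using Iter = std::string::const_iterator;

    // Hypothetical helper: succeeds only if the whole input is a chain of
    // at least two nodes separated by "->"; fills `nodes` on success.
    bool parse_edge_chain(const std::string& input, std::vector<std::string>& nodes)
    {
        qi::rule<Iter, std::string()> node = +qi::alnum;
        // The declared attribute forces the chain to expose one vector
        qi::rule<Iter, std::vector<std::string>(), qi::space_type> chain =
            node >> +("->" >> node);

        Iter first = input.begin(), last = input.end();
        bool ok = qi::phrase_parse(first, last, chain, qi::space, nodes)
                  && first == last;
        if (!ok) {
            nodes.clear(); // a failed parse may leave partial results behind
        }
        return ok;
    }

    int main()
    {
        for (std::string input : {"node1", "node1->node2", "node1->node2->node3"}) {
            std::vector<std::string> nodes;
            if (parse_edge_chain(input, nodes)) {
                std::cout << input << ": " << nodes.size() << " nodes\n";
            } else {
                std::cout << input << ": rejected\n";
            }
        }
    }
    ```

    With this rule, "node1" on its own is rejected, while chains of two or more nodes end up in a single std::vector<std::string>.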