Tags: c++, boost, boost-spirit, boost-range

Combine boost::spirit and boost::any_range?


The function boost::spirit::qi::parse() expects two iterators that define the input range. This works well when I parse from a std::string or a std::istream. Now I want to implement a more generic interface for my parser. One approach is to use boost::any_range to define the input. Here is my test code. It compiles, but at run time it throws an exception: "string iterator not dereferencable".

Second question: how can I combine boost::any_range with boost::spirit::classic::position_iterator to detect the position of a possible error?

#include <boost/range/any_range.hpp>
#include <boost/spirit/include/classic_position_iterator.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_multi_pass.hpp>
#include <string>

namespace qi = boost::spirit::qi;

typedef boost::any_range<
    char,
    boost::forward_traversal_tag,
    char,
    std::ptrdiff_t
> input_type;

template < typename _Iterator >
    struct decode
    : qi::grammar< _Iterator >
    {
        decode( ) : decode::base_type( m_rule )
        {
            m_rule = qi::int_;

            BOOST_SPIRIT_DEBUG_NODES( ( m_rule ) )
        }

        qi::rule< _Iterator > m_rule;
    };

bool parse( const input_type& in, int& out )
{
    // We use a stream iterator to access the given stream:
    typedef boost::spirit::multi_pass<
        input_type::const_iterator
    > stream_iterator;

    // Create begin iterator for given stream:
    stream_iterator sBegin = boost::spirit::make_default_multi_pass( input_type::const_iterator( in.begin( ) ) );
    stream_iterator sEnd   = boost::spirit::make_default_multi_pass( input_type::const_iterator( ) );

    // Create an instance of the used grammar:
    decode<
        stream_iterator
    > gr;

    // Try to decode the data stored in the stream according to the grammar and store the result in the out variable:
    bool r = boost::spirit::qi::parse( sBegin,
                                       sEnd,
                                       gr,
                                       out );

    return r && sBegin == sEnd;
}

int main( )
{
    std::string in = "12345"; int out;

    parse( in, out );
}

Update

1.) I agree that default-constructing the sEnd iterator was a mistake, so I simplified my example. I think I misunderstood how to use the multi_pass iterator: in the code below, c0 is false (as expected) but c1 is true (not expected). So what is the right way to use the multi_pass iterator?

#include <boost/range/any_range.hpp>
#include <boost/spirit/include/classic_position_iterator.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/support_multi_pass.hpp>
#include <string>

namespace qi = boost::spirit::qi;

typedef boost::any_range<
    char,
    boost::forward_traversal_tag,
    char,
    std::ptrdiff_t
> input_type;

bool parse( const input_type& in, int& out )
{
    //for( input_type::iterator i = in.begin( ); i != in.end( ); ++i )
    //{
    //    std::cout << *i;
    //}

    // We use a stream iterator to access the given stream:
    typedef boost::spirit::multi_pass<
        input_type::const_iterator,
        boost::spirit::iterator_policies::default_policy<                // Defaults:
            boost::spirit::iterator_policies::ref_counted,               // OwnershipPolicy: ref_counted
            boost::spirit::iterator_policies::buf_id_check,              // CheckingPolicy : buf_id_check
            boost::spirit::iterator_policies::buffering_input_iterator,  // InputPolicy    : buffering_input_iterator
            boost::spirit::iterator_policies::split_std_deque            // StoragePolicy  : split_std_deque
        >
    > stream_iterator;

    bool c0 = in.begin( ) == in.end( );

    // Create begin iterator for given stream:
    stream_iterator sBegin( in.begin( ) );
    stream_iterator sEnd(   in.end( )   );

    bool c1 = sBegin == sEnd;

    //for( stream_iterator i = sBegin; i != sEnd; ++i )
    //{
    //    std::cout << *i;
    //}

    return false;
}
int main( )
{
    std::string in = "12345"; int out;

    parse( in, out );
}

2.) Yes, I can compile a new grammar instance for each type of input iterator. My idea was to hide the implementation detail (boost::spirit) from the user and offer a generic interface, so I would like to avoid a template function.

3.) Yes, I forgot to expose the attribute. It was only a quick-and-dirty example. Thanks for the hint.


Solution

  • The default-constructed iterator is not equivalent to the end iterator of your range.

    That convention is usually followed only by input iterators, such as stream iterators.

    Because sEnd never compares equal to the real end of the data, the parser keeps reading. Luckily, you're on a compiler/library implementation that detects the past-the-end access.


    In reality, can't you just compile a new grammar (decode<>) instance for each kind of input iterator? This is the whole point of generic programming in C++.


    UPDATE

    This is what I'd do:

    • Note that do_parse (and everything related to Spirit, or indeed Boost) can be hidden in the .cpp file.

    #include <boost/spirit/include/qi.hpp>
    #include <iostream>
    #include <sstream>
    #include <stdexcept>
    #include <string>
    
    namespace mylib {
        struct public_api {
            int parse(std::string const& input);
            int parse(std::istream& stream);
        };
    
        template<typename It>
        static int do_parse(It f, It l) {
            namespace qi = boost::spirit::qi;
    
            int result;
            if (qi::parse(f, l, qi::int_, result))
                return result;
    
            throw std::runtime_error("parse failure");
        }
    
        int public_api::parse(std::string const& input) {
            return do_parse(input.begin(), input.end());
        }
    
        int public_api::parse(std::istream& stream) {
            boost::spirit::istream_iterator f(stream >> std::noskipws), l;
            return do_parse(f, l);
        }
    }
    
    int main()
    {
        std::istringstream iss("12345");
        std::string const s("23456");
    
        mylib::public_api api;
        std::cout << api.parse(s)   << "\n";
        std::cout << api.parse(iss) << "\n";
    }
    

    Prints

    23456
    12345