boost spirit x3 error handler with expectations


I want to create a parser using boost::spirit::x3 for line-based files, i.e. every line has the same structure and lines can be repeated. I also want a detailed error description when parsing fails. Finally, the file should be allowed to end with a newline character.

Now I have encountered some weird behavior when I use x3::expect on the first element of the line. The error handler prints an error, but the overall parse does not fail. Why is this happening, and how can it be fixed? If I do not expect the first element of the line, I do not get a detailed error description.

Here is an example to reproduce this problem:

#include <boost/spirit/home/x3.hpp>
#include <boost/spirit/home/x3/support/utility/annotate_on_success.hpp>
#include <boost/spirit/home/x3/support/utility/error_reporting.hpp>
#include <boost/fusion/include/define_struct.hpp>

#include <iostream>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

namespace x3 = boost::spirit::x3;

struct error_handler
{
    template<typename Iterator, typename Exception, typename Context>
    x3::error_handler_result on_error(Iterator& first, Iterator const& last,
                                      Exception const& x,
                                      Context const& context)
    {
        auto& error_handler = x3::get<x3::error_handler_tag>(context).get();
        std::string message = "Error! Expecting: " + x.which() + " here:";
        error_handler(x.where(), message);
        return x3::error_handler_result::fail;
    }
};

namespace boost::spirit::x3 {
    template<>
    struct get_info<int_type>
    {
        std::string operator()(int_type const&) const
        { return "integral number"; }
    };

    template<>
    struct get_info<char_type>
    {
        std::string operator()(char_type const&) const
        { return "character"; }
    };
}  // namespace boost::spirit::x3

struct Line_tag : error_handler
{
};
struct File_tag : error_handler
{
};

BOOST_FUSION_DEFINE_STRUCT((), Data, (char, c)(int, x))
BOOST_FUSION_DEFINE_STRUCT((), DataContainer, (std::vector<Data>, data))

template<bool ExpectFirstElementOfLine>
DataContainer parse(std::string_view input)
{
    auto iter = input.cbegin();
    auto const end = input.cend();

    const auto charParser = []() {
        if constexpr (ExpectFirstElementOfLine)
            return x3::expect[x3::char_("a-zA-Z")];
        else
            return x3::char_("a-zA-Z");
    }();

    const auto line = x3::rule<Line_tag, Data>{"line"} = charParser > x3::int_;
    const auto file = x3::rule<File_tag, DataContainer>{"file"} = (line % x3::eol) >> -x3::eol >> x3::eoi;

    x3::error_handler<decltype(iter)> error_handler(iter, end, std::cout);
    DataContainer container;
    if (parse(iter, end, x3::with<x3::error_handler_tag>(std::ref(error_handler))[file], container))
    {
        if (iter != end)
            throw std::runtime_error("Remaining unparsed");
    }
    else
        throw std::runtime_error("Parse failed");

    return container;
}

template<bool ExpectFirstElementOfLine>
void testParse(std::string_view input)
{
    try
    {
        std::cout << "=========================" << std::endl;
        const auto container = parse<ExpectFirstElementOfLine>(input);
        std::cout << "Parsed [OK]: " << container.data.size() << std::endl;
    }
    catch (const std::exception& ex)
    {
        std::cout << "EXCEPTION: " << ex.what() << std::endl;
    }
}

int main()
{
    const std::string_view input1 = "x1\nx456";
    const std::string_view input2 = "x1\nx456\n";
    const std::string_view input3 = "x1\n456\n";
    // OK
    testParse<true>(input1);
    testParse<false>(input1);

    // parse succeeds but error handler prints message if expectation on first element of line is used
    testParse<true>(input2);
    testParse<false>(input2);

    // parsing fails but detailed error description only works if first element of line was expected
    testParse<true>(input3);
    testParse<false>(input3);
}

which yields:

=========================
Parsed [OK]: 2
=========================
Parsed [OK]: 2
=========================
In line 3:
Error! Expecting: char-set here:

^_
Parsed [OK]: 2
=========================
Parsed [OK]: 2
=========================
In line 2:
Error! Expecting: char-set here:
456
^_
EXCEPTION: Parse failed
=========================
EXCEPTION: Parse failed

Solution

    1. Why is there an expectation failure for testParse<true>("x1\nx456\n")?

      The (line % x3::eol) parser attempts line three times for that input:

      1. Try line -- ok (consume x1), try x3::eol -- ok (consume \n), repeat
      2. Try line -- ok (consume x456), try x3::eol -- ok (consume \n), repeat
      3. Try line: it tries x3::expect[x3::char_("a-zA-Z")], but the input is exhausted, so the match fails -- this is where the expectation failure occurs
    2. The error handler prints an error, but the overall parsing does not fail. Why is this happening?

      When an expectation parser fails, it throws an expectation_failure exception. However, when you set up an error handler for a rule, the rule catches the exception for you and calls your error handler. The error handler tells the rule how to proceed by returning a value of the error_handler_result enum type.

      Your error handler returns error_handler_result::fail, which tells the rule to simply fail the parse, effectively turning expect[x] into x. In other words, your error handler is just a semantic action on failure (instead of on success, as with usual semantic actions).

      The list parser line % x3::eol is just line >> *(x3::eol >> line). Because your error handler turns every expectation failure into a regular failure, any failure after the first successfully parsed line only ends the repetition; it does not fail the whole parse. For input2 the optional -x3::eol then consumes the trailing newline and x3::eoi matches, so the parse succeeds even though the handler has already printed its message.

    3. And how it can be fixed?

      You did not mention what exactly you want. If you just want the expectation_failure exception to propagate, return error_handler_result::rethrow from your error handler.