This question touches on two issues (as shown by the accepted answer), both of which are present in the version of Boost Spirit X3 that ships with Boost 1.64, but both of which are now fixed (or at least detected at compile time) in develop branch at the time of writing (2017-04-30).
I have updated the mcve project to reflect the changes that I made to use the develop branch instead of the latest boost release, in the hopes that it might help out others who face a similar issues.
I am trying to learn how to break up Spirit X3 parsers into separate reusable grammars, as encouraged by the example code (rexpr_full and calc in particular) and the presentations from CppCon 2015 and BoostCon.
I have a symbol table (essentially mapping different types to a enum class of the types I am supporting), which I would like to reuse in several parsers. The only example of symbol tables I could find is the roman numerals example, which is in a single source file.
When I try to move the symbol table into its own cpp/h file in the style of the more structured examples my parser will segfault if I try to parse any string which is not in the symbol table. If the symbol table is defined in the same compilation unit as the parsers that use it throws an expectation exception instead (which is what I would expect it to do).
With BOOST_SPIRIT_X3_DEBUG defined I get the following output:
<FruitType>
<try>GrannySmith: Mammals</try>
<Identifier>
<try>GrannySmith: Mammals</try>
<success>: Mammals</success>
<attributes>[[
Process finished with exit code 11
I have made a small project which shows what I am trying to achieve and is available here: https://github.com/sigbjornlo/spirit_fruit_mcve
My questions:
fruit
parser in one other parser, but in my full project I want to use it in several other parsers.)Below is the code for the MCVE project:
main.cpp
#include <iostream>
#define BOOST_SPIRIT_X3_DEBUG
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/include/adapt_struct.hpp>
#include "common.h"
#include "fruit.h"
namespace ast {
struct FruitType {
std::string identifier;
FRUIT fruit;
};
}
BOOST_FUSION_ADAPT_STRUCT(ast::FruitType, identifier, fruit);
namespace parser {
// Identifier
using identifier_type = x3::rule<class identifier, std::string>;
const auto identifier = identifier_type {"Identifier"};
const auto identifier_def = x3::raw[x3::lexeme[(x3::alpha | '_') >> *(x3::alnum | '_')]];
BOOST_SPIRIT_DEFINE(identifier);
// FruitType
struct fruit_type_class;
const auto fruit_type = x3::rule<fruit_type_class, ast::FruitType> {"FruitType"};
// Using the sequence operator creates a grammar which fails gracefully given invalid input.
// const auto fruit_type_def = identifier >> ':' >> make_fruit_grammar();
// Using the expectation operator causes EXC_BAD_ACCESS exception with invalid input.
// Instead, I would have expected an expectation failure exception.
// Indeed, an expectation failure exception is thrown when the fruit grammar is defined here in this compilation unit instead of in fruit.cpp.
const auto fruit_type_def = identifier > ':' > make_fruit_grammar();
BOOST_SPIRIT_DEFINE(fruit_type);
}
int main() {
std::string input = "GrannySmith: Mammals";
parser::iterator_type it = input.begin(), end = input.end();
const auto& grammar = parser::fruit_type;
auto result = ast::FruitType {};
bool successful_parse = boost::spirit::x3::phrase_parse(it, end, grammar, boost::spirit::x3::ascii::space, result);
if (successful_parse && it == end) {
std::cout << "Parsing succeeded!\n";
std::cout << result.identifier << " is a kind of " << to_string(result.fruit) << "!\n";
} else {
std::cout << "Parsing failed!\n";
}
return 0;
}
std::string to_string(FRUIT fruit) {
switch (fruit) {
case FRUIT::APPLES:
return "apple";
case FRUIT::ORANGES:
return "orange";
}
}
common.h
#ifndef SPIRIT_FRUIT_COMMON_H
#define SPIRIT_FRUIT_COMMON_H
namespace x3 = boost::spirit::x3;
enum class FRUIT {
APPLES,
ORANGES
};
std::string to_string(FRUIT fruit);
namespace parser {
using iterator_type = std::string::const_iterator;
using context_type = x3::phrase_parse_context<x3::ascii::space_type>::type;
}
#endif //SPIRIT_FRUIT_COMMON_H
fruit.h
#ifndef SPIRIT_FRUIT_FRUIT_H
#define SPIRIT_FRUIT_FRUIT_H
#include <boost/spirit/home/x3.hpp>
#include "common.h"
namespace parser {
struct fruit_class;
using fruit_grammar = x3::rule<fruit_class, FRUIT>;
BOOST_SPIRIT_DECLARE(fruit_grammar)
fruit_grammar make_fruit_grammar();
}
#endif //SPIRIT_FRUIT_FRUIT_H
fruit.cpp
#include "fruit.h"
namespace parser {
struct fruit_symbol_table : x3::symbols<FRUIT> {
fruit_symbol_table() {
add
("Apples", FRUIT::APPLES)
("Oranges", FRUIT::ORANGES);
}
};
struct fruit_class;
const auto fruit = x3::rule<fruit_class, FRUIT> {"Fruit"};
const auto fruit_def = fruit_symbol_table {};
BOOST_SPIRIT_DEFINE(fruit);
BOOST_SPIRIT_INSTANTIATE(fruit_grammar, iterator_type, context_type);
fruit_grammar make_fruit_grammar() {
return fruit;
}
}
Very good work on the reproducer. This reminded me of my PR https://github.com/boostorg/spirit/pull/229 (see analysis here Strange semantic behaviour of boost spirit x3 after splitting).
The problem would be Static Initialization Order Fiasco taking copies of the debug-names of rules before they became initialized.
In fact, indeed disabling the debug information does remove the crash, and correctly throws the expectation failure.
The same happens with the develop branch¹, so either there is another similar thing, or I missed a spot. For now, know you can disable the debug output. I'll post an update if I find the spot.
I didn't miss a spot. There's a separate issue in call_rule_definition
where it parameterizes the context_debug<>
helper class with the actual attribute type instead of the transformed one:
#if defined(BOOST_SPIRIT_X3_DEBUG)
typedef typename make_attribute::type dbg_attribute_type;
context_debug<Iterator, dbg_attribute_type>
dbg(rule_name, first, last, dbg_attribute_type(attr_), ok_parse);
#endif
The comment seems to suggest that this behaviour is as desired: it tries to print the attribute before transformation. However, it totally doesn't work unless the synthesized attribute type matches the actual attribute type. In this case, it makes context_debug
take a dangling reference to a temporary converted attribute, leading to Undefined Behaviour.
It's in fact also undefined behaviour in the working cases. I can only assume things happen to pan out nicely in the inline-definition case, making it seem like things work as intended.
To the best of my knowledge this would be a clean fix, preventing any unwarranted conversions and the temporaries that come with them:
#if defined(BOOST_SPIRIT_X3_DEBUG)
context_debug<Iterator, transform_attr>
dbg(rule_name, first, last, attr_, ok_parse);
#endif
I've created pull request for this: https://github.com/boostorg/spirit/pull/232
¹ develop branch doesn't seem merged into the 1.64 release