
Mixing non-terminal rules from separate translation units


Introduction

I am trying to use two non-terminal rules that are not defined in the same translation unit. A minimal example reproducing the issue is provided below, and is also available live on Coliru.

TEST0
Re-using a rule directly (without embedding it into another rule) works OK, even though it is defined in another translation unit. This is the well-known program structure example from the X3 documentation. This is configuration TEST0 in the live test below.

TEST1
I initially avoided the BOOST_SPIRIT_DEFINE/DECLARE/INSTANTIATE() macros for one of the non-terminal rules with:

auto const parser2
    = x3::rule<class u2, uint64_t>{"parser2"}
    = "Trace address: " >> parser1();

which resulted in an unresolved external symbol linker error. Surprisingly, the missing symbol belongs to parser1 (and not parser2), even though the BOOST_XXX macros are used for it (see unit1.cpp). This is configuration TEST1.

TEST2
I then moved to configuration TEST2, where the BOOST_XXX macros are used for both rules. This solution compiles and runs with Visual Studio 2019 (v16.8.3) but produces a core dump with g++ (as can be seen in the test below).

Minimal example reproducing the issue

unit1.h

#ifndef UNIT1_H
#define UNIT1_H
#include <cstdint>
#include "boost/spirit/home/x3.hpp"
#include "boost/spirit/include/support_istream_iterator.hpp"

namespace x3 = boost::spirit::x3;
using iter_t = boost::spirit::istream_iterator;
using context_t = x3::phrase_parse_context<x3::ascii::space_type>::type;

namespace unit1 {
    using parser1_t = x3::rule<class u1, std::uint64_t>;
    BOOST_SPIRIT_DECLARE(parser1_t);
}

unit1::parser1_t const& parser1();

#endif /* UNIT1_H */

unit1.cpp

#include "unit1.h"

namespace unit1 {
    parser1_t const parser1 = "unit1_rule";
    auto const parser1_def = x3::uint_;
    BOOST_SPIRIT_DEFINE(parser1)
    BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context_t)
}
unit1::parser1_t const& parser1() { return unit1::parser1; }

main.cpp

#include <iostream>
#include <sstream>
#include <string>
#include "unit1.h"

namespace x3 = boost::spirit::x3;
#define TEST2

#ifdef TEST2
    auto const parser2 = x3::rule<class u2, uint64_t>{"parser2"};
    auto const parser2_def = "Trace address: " >> parser1();
    BOOST_SPIRIT_DECLARE(decltype(parser2))
    BOOST_SPIRIT_DEFINE(parser2)
    BOOST_SPIRIT_INSTANTIATE(decltype(parser2),iter_t,context_t)
#endif

int main(int argc, char* argv[])
{
    std::string input("Trace address: 123434");
    std::istringstream i(input);

    std::cout << "parsing: " << input << "\n";

    boost::spirit::istream_iterator b{i >> std::noskipws};
    boost::spirit::istream_iterator e{};

    uint64_t addr=0;
#ifdef TEST0
    bool v = x3::phrase_parse(b, e, "Trace address: " >> parser1(), x3::ascii::space,addr);
#elif defined TEST1
    auto const parser2 
        = x3::rule<class u2, uint64_t>{ "parser2" } 
        = "Trace address: " >> parser1();
    bool v = x3::phrase_parse(b, e, parser2, x3::ascii::space,addr);
#elif defined TEST2
    bool v = x3::phrase_parse(b, e, parser2, x3::ascii::space,addr);
#endif 
    std::cout << "result: " << (v ? "OK" : "Failed") << "\n";
    std::cout << "result: " << addr << "\n";
    return v;
}

I feel I am not doing this correctly. Here are my questions:

Unresolved external symbols and parser Context

In configuration TEST1 the error message is undefined reference to unit1::parse_rule<...>, which means parser1 is not instantiated with the right context. OK, but then what context should I use in this situation? Even if I move parser2 out of the main() function, I get more or less the same issue. I could of course display the context and try to BOOST_SPIRIT_INSTANTIATE() with it, but I feel this is not the way to go. Surprisingly, instantiating parser2 instead seems to solve the issue (on Visual Studio at least).

Mixing rules from separated translation units

Why is it so complicated, whereas everything works OK if I remove the rule in parser2?


Solution

  • Q. Why is it so complicated [...]

    The machinery that statically links rule definitions to rules by their tag type (rule id) is tricky. It hinges on there being a specialization of a parse_rule function template.

    However, the function template depends on:

    • the rule id ("tag type")
    • iterator type
    • the context (includes things like skipper or with<> directives)

    All of the types must match exactly. This is a frequent source of error.
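    To make the "must match exactly" point concrete, here is a minimal Boost-free sketch of the same declare / define / explicitly-instantiate split. Every type in it (context, rule, skipper_tag, space_type, other_context_t) is a hypothetical stand-in that I made up for illustration, not the real X3 type, and the digit-eating body is only a placeholder for a real rule definition.

```cpp
#include <cctype>
#include <type_traits>

// Hypothetical stand-ins for illustration; these are not the real X3 types.
struct unused_type {};
template <class Tag, class Value, class Next = unused_type> struct context {};
struct skipper_tag {};
struct space_type {};
struct u1 {};  // rule id of the separately compiled rule
struct u2 {};  // rule id of some enclosing rule
template <class Id> struct rule {};

using iter_t    = char const*;
using context_t = context<skipper_tag, space_type const>;

// "Header": declaration only (the role the DECLARE step plays).
template <class Iterator, class Context>
bool parse_rule(rule<u1>, Iterator& first, Iterator const& last, Context const& ctx);

// "Source file": the definition (the role the DEFINE step plays).
// As a placeholder it simply consumes leading digits.
template <class Iterator, class Context>
bool parse_rule(rule<u1>, Iterator& first, Iterator const& last, Context const&)
{
    Iterator const start = first;
    while (first != last && std::isdigit(static_cast<unsigned char>(*first)))
        ++first;
    return first != start;
}

// Explicit instantiation for exactly one (iterator, context) combination
// (the role the INSTANTIATE step plays).
template bool parse_rule<iter_t, context_t>(rule<u1>, iter_t&, iter_t const&,
                                            context_t const&);

// A caller that passes any other context asks for a different specialization.
using other_context_t = context<u2, int const, context_t>;
static_assert(!std::is_same_v<context_t, other_context_t>,
              "distinct contexts name distinct parse_rule specializations");
```

    In a single file the compiler can still instantiate the second specialization on demand, because the definition is visible. Once the definition lives in another translation unit, only the explicitly instantiated combination exists at link time, and any other combination becomes an undefined reference.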

    Q. [...] whereas if I remove the rule in parser2, everything works OK?

    Likely because either the rule definition is visible to the compiler to instantiate at that point, or alternatively because the types match up as just described.

    I'll look at your specific code shortly.

    REPRO

    Reading The Compiler Messages

    My compiler warns with -DTEST1:

    unit1.h|13 col 5| warning: ‘bool unit1::parse_rule(unit1::parser1_t, Iterator&, const Iterator&, const Context&, boost::spirit::x3::rule<unit1::u1, long unsigned int>::attribute_type&) [with Iterator = boost::spirit::basic_istream_iterator<char>; Context = boost::spirit::x3::context<main()::u2, const boost::spirit::x3::sequence<boost::spirit::x3::literal_string<const char*, boost::spirit::char_encoding::standard, boost::spirit::x3::unused_type>, boost::spirit::x3::rule<unit1::u1, long unsigned int> >, boost::spirit::x3::context<boost::spirit::x3::skipper_tag, const boost::spirit::x3::char_class<boost::spirit::char_encoding::ascii, boost::spirit::x3::space_tag>, boost::spirit::x3::unused_type> >]’ used but never defined

    This spells the exact type arguments for the template specialization to explicitly-instantiate in a TU.

    The linker error spells the missing symbol:

    /home/sehe/custom/spirit/include/boost/spirit/home/x3/nonterminal/rule.hpp:135: undefined reference to `bool unit1::parse_rule<boost::spirit::basic_istream_iterator<char, std::char_traits<char> >, boost::spirit::x3::context<main::u2, boost::spirit::x3::sequence<boost::spirit::x3::literal_string<char const*, boost::spirit::char_encoding::standard, boost::spirit::x3::unused_type>, boost::spirit::x3::rule<unit1::u1, unsigned long, false> > const, boost::spirit::x3::context<boost::spirit::x3::skipper_tag, boost::spirit::x3::char_class<boost::spirit::char_encoding::ascii, boost::spirit::x3::space_tag> const, boost::spirit::x3::unused_type> > >(boost::spirit::x3::rule<unit1::u1, unsigned long, false>, boost::spirit::basic_istream_iterator<char, std::char_traits<char> >&, boost::spirit::basic_istream_iterator<char, std::char_traits<char> > const&, boost::spirit::x3::context<main::u2, boost::spirit::x3::sequence<boost::spirit::x3::literal_string<char const*, boost::spirit::char_encoding::standard, boost::spirit::x3::unused_type>, boost::spirit::x3::rule<unit1::u1, unsigned long, false> > const, boost::spirit::x3::context<boost::spirit::x3::skipper_tag, boost::spirit::x3::char_class<boost::spirit::char_encoding::ascii, boost::spirit::x3::space_tag> const, boost::spirit::x3::unused_type> > const&, unsigned long&)'

    All in all your task is to compare them (!!) and note the discrepancy.
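    If comparing two walls of template text by eye is too painful, one option is to let the compiler print the type you intend to instantiate and diff that against the linker message. The helper below is a generic C++17 trick, not something X3 provides. The name pretty_type and the placeholder context type are made up for this sketch, and the exact text it returns is compiler-specific.

```cpp
#include <string_view>

// Hypothetical placeholder types, only here to have something to print.
struct skipper_tag {};
struct space_type {};
template <class Tag, class Value, class Next = void> struct context {};

// Returns the compiler's own spelling of this function's signature,
// which includes the spelled-out template argument T.
template <class T>
constexpr std::string_view pretty_type()
{
#if defined(_MSC_VER) && !defined(__clang__)
    return __FUNCSIG__;          // MSVC spelling
#else
    return __PRETTY_FUNCTION__;  // GCC / Clang spelling
#endif
}
```

    Streaming pretty_type<context_t>() to std::cout from the translation unit that holds the BOOST_SPIRIT_INSTANTIATE line gives you a string to put next to the context named in the linker error.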

    Reading The Macro Magic

    Expanding the macros gives:

    template <typename Iterator, typename Context> inline bool parse_rule( decltype(parser1) , Iterator& first, Iterator const& last , Context const& context, decltype(parser1)::attribute_type& attr) { using boost::spirit::x3::unused; static auto const def_ = (parser1 = parser1_def); return def_.parse(first, last, context, unused, attr); }
    template bool parse_rule<iter_t, context_t>( parser1_t rule_ , iter_t& first, iter_t const& last , context_t const& context, parser1_t::attribute_type&);
    

    Which is for the ...DEFINE:

    template <typename Iterator, typename Context>
    inline bool parse_rule(decltype(parser1), Iterator& first,
        Iterator const& last, Context const& context,
        decltype(parser1)::attribute_type& attr)
    {
        using boost::spirit::x3::unused;
        static auto const def_ = (parser1 = parser1_def);
        return def_.parse(first, last, context, unused, attr);
    }
    

    And for the explicit ...INSTANTIATE:

    template bool parse_rule<iter_t, context_t>(parser1_t rule_, iter_t& first,
        iter_t const& last, context_t const& context,
        parser1_t::attribute_type&);
    

    Substituting out the types shows exactly what is instantiated (see the warning above).

    Other Options

    Short of straining my eyes, we know what template type params could be wrong, so let's check them:

    1. iterator:

      static_assert(std::is_same_v<iter_t, boost::spirit::istream_iterator>);
      iter_t b{i >> std::noskipws}, e {};
      

      This was not the culprit, the compiler confirms.

    2. The skipper ought to be x3::ascii::space_type which also seems to match up fine.

    3. The problem must be the context. Now let's extract the context from the linker error:

      bool unit1::parse_rule<...> >
      (x3::rule<unit1::u1, unsigned long, false>, iter_t &, iter_t const &,
      
       // this is the context:
       x3::context<
           main::u2,
           x3::sequence<x3::literal_string<char const *,
                                           boost::spirit::char_encoding::standard,
                                           x3::unused_type>,
                        x3::rule<unit1::u1, unsigned long, false>> const,
           x3::context<x3::skipper_tag,
                       x3::char_class<boost::spirit::char_encoding::ascii,
                                      x3::space_tag> const,
                       x3::unused_type>> const &,
      
       // this is the attribute
       unsigned long &);
      

    That doesn't look like the context we expect. I reckon the problem is that the parser2 definition is "in sight", so the context ends up containing that definition (this is the mechanism that allows local x3::rule definitions without any define-macro magic at all).
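    The nesting seen in the linker error can be modelled in a few lines. This is a simplified Boost-free model of the mechanism as I understand it, not X3's actual implementation: rule_definition::inner_context and all other types here are hypothetical stand-ins.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration; these are not the real X3 types.
struct unused_type {};
template <class Tag, class Value, class Next = unused_type> struct context {};
struct skipper_tag {};
struct space_type {};
struct u1 {};  // id of the rule defined in another translation unit
struct u2 {};  // id of the rule defined inline, without the macros
template <class Id> struct rule {};
struct literal_string {};
template <class Left, class Right> struct sequence {};

// The context a top-level phrase_parse call with a space skipper starts from.
using context_t = context<skipper_tag, space_type const>;

// Model of "rule = definition" without the DEFINE macro: before descending,
// the definition is pushed onto the context under the rule's id.
template <class Id, class Definition>
struct rule_definition {
    template <class Context>
    using inner_context = context<Id, Definition const, Context>;
};

// parser2 = "Trace address: " >> parser1
using parser2_def_t   = sequence<literal_string, rule<u1>>;
using seen_by_parser1 = rule_definition<u2, parser2_def_t>::inner_context<context_t>;

// parser1 is therefore asked to parse with a context that wraps context_t,
// which is not the context it was explicitly instantiated for.
static_assert(!std::is_same_v<seen_by_parser1, context_t>,
              "the nested rule sees a different context");
```

    Read this way, the linker message asks for parse_rule with exactly that wrapped context: the outer layer carries u2 and the sequence, the inner layer is the plain skipper context.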

    I remember a more recent mailing list post pointing this out (and it was kind of a surprise to me back then): https://sourceforge.net/p/spirit/mailman/message/37194823/

    On Tue, 05 Jan 13:12, Larry Evans wrote:

    However, there's another reason to use BOOST_SPIRIT_DEFINE. When there is a lot of recursive rules, and BOOST_SPIRIT_DEFINE is not used, this causes much heavier template processing and concomitant slow compile times. The reason is that, without BOOST_SPIRIT_DEFINE, the definition for a rule is stored in the context and this is what causes the explosion in compile-times.

    So, be aware of this when you notice compile times slow as you add more recursive rules.

    Thanks for pointing this out. I've run into this without realizing that omitting the definition-separation was a critical factor.

    I guess then that it also could provide relief in some cases that cause extreme template recursion when the rules change skipper (Because the context keeps being technically different).

    Again, this is actually a very helpful note. Thanks.

    Seth

    Earlier in the thread I expressed the reasons why I dislike the macro machinery and never spread my X3 rules across TUs. By now you might appreciate that sentiment :)

    Workarounds

    You could work around this by manufacturing the correct context type and instantiating that (as well). In unit1.h:

    struct u2;
    using context2_t = x3::context<
        u2,
        decltype("" >> parser1_t{}) const,
        context_t>;
    
    BOOST_SPIRIT_DECLARE(parser1_t)
    

    And in the cpp:

    BOOST_SPIRIT_DEFINE(parser1)
    BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context_t) // optionally
    BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context2_t)
    

    Not surprisingly, this works: https://wandbox.org/permlink/Y6NsKCcIDgiHGJf2
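    Why an empty string literal is good enough inside decltype: the type of the sequence does not encode the text of the literal, only the kinds of the operands. The sketch below shows that idea with hypothetical stand-in types and a made-up operator>>. It models the trick, it is not X3 code.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration; these are not the real X3 types.
struct unused_type {};
template <class Tag, class Value, class Next = unused_type> struct context {};
struct skipper_tag {};
struct space_type {};
struct u1 {};
struct u2 {};
template <class Id> struct rule {};
struct literal_string {};  // the text would be held at run time, not in the type
template <class Left, class Right> struct sequence {};

// Stand-in for "string literal >> rule".
template <class Id>
constexpr sequence<literal_string, rule<Id>> operator>>(char const*, rule<Id>)
{
    return {};
}

using context_t = context<skipper_tag, space_type const>;

// Any literal produces the same sequence type, so "" works as a placeholder.
static_assert(std::is_same_v<decltype("" >> rule<u1>{}),
                             decltype("Trace address: " >> rule<u1>{})>,
              "the literal's text is not part of the type");

// The manufactured context: id of the enclosing rule, its definition type
// (const-qualified), then the context the enclosing rule itself was given.
using context2_t = context<u2, decltype("" >> rule<u1>{}) const, context_t>;
```

    The fragile part of the workaround is that context2_t has to be kept in sync by hand: change the shape of parser2's definition or its skipper and the type changes with it.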

    Summary

    To my own surprise, I once again learned a reason to dislike X3's rule separation magic. However, if you need it, you should probably not mix and match, but define parser2 out-of-line as well:

    namespace unit2 {
        parser2_t parser2 = "unit2_rule";
        auto const parser2_def = "Trace address: " >> parser1();
    
        BOOST_SPIRIT_DEFINE(parser2)
        BOOST_SPIRIT_INSTANTIATE(parser2_t, iter_t, context_t)
    } // namespace unit2
    

    See it Live On Wandbox again

    Full Listings

    For posterity from Wandbox:

    • File unit1.cpp

       #include "unit1.h"
      
       namespace unit1 {
           parser1_t parser1 = "unit1_rule";
           auto const parser1_def = x3::uint_;
      
           BOOST_SPIRIT_DEFINE(parser1)
           BOOST_SPIRIT_INSTANTIATE(parser1_t, iter_t, context_t)
       } // namespace unit1
       unit1::parser1_t const &parser1() { return unit1::parser1; }
      
    • File unit1.h

       #ifndef UNIT1_H
       #define UNIT1_H
       #include "boost/spirit/home/x3.hpp"
       #include "boost/spirit/include/support_istream_iterator.hpp"
       #include <cstdint>
      
       namespace x3    = boost::spirit::x3;
       using iter_t    = boost::spirit::istream_iterator;
       using context_t  = x3::phrase_parse_context<x3::ascii::space_type>::type;
      
       namespace unit1 {
           using parser1_t = x3::rule<class u1, std::uint64_t> const;
           BOOST_SPIRIT_DECLARE(parser1_t)
       } // namespace unit1
      
       unit1::parser1_t const &parser1();
      
       #endif /* UNIT1_H */
      
    • File unit2.cpp

       #include "unit2.h"
       #include "unit1.h"
      
       namespace unit2 {
           parser2_t parser2 = "unit2_rule";
           auto const parser2_def = "Trace address: " >> parser1();
      
           BOOST_SPIRIT_DEFINE(parser2)
           BOOST_SPIRIT_INSTANTIATE(parser2_t, iter_t, context_t)
       } // namespace unit2
       unit2::parser2_t const &parser2() { return unit2::parser2; }
      
    • File unit2.h

       #ifndef UNIT2_H
       #define UNIT2_H
       #include "boost/spirit/home/x3.hpp"
       #include "boost/spirit/include/support_istream_iterator.hpp"
       #include <cstdint>
      
       namespace x3    = boost::spirit::x3;
       using iter_t    = boost::spirit::istream_iterator;
       using context_t  = x3::phrase_parse_context<x3::ascii::space_type>::type;
      
       namespace unit2 {
           using parser2_t = x3::rule<class u2, std::uint64_t> const;
           BOOST_SPIRIT_DECLARE(parser2_t)
       } // namespace unit2
      
       unit2::parser2_t const &parser2();
      
       #endif /* UNIT2_H */
      
    • File main.cpp

       #include "unit2.h"
       #include <iostream>
       #include <sstream>
       #include <string>
       #include <type_traits>
      
       namespace x3 = boost::spirit::x3;
      
       int main() {
           std::string input("Trace address: 123434");
           std::istringstream i(input);
      
           std::cout << "parsing: " << input << "\n";
      
           static_assert(std::is_same_v<iter_t, boost::spirit::istream_iterator>);
           iter_t b{i >> std::noskipws}, e {};
      
           uint64_t addr = 0;
           bool v = x3::phrase_parse(b, e, parser2(), x3::ascii::space, addr);
           std::cout << "result: " << (v ? "OK" : "Failed") << "\n";
           std::cout << "result: " << addr << "\n";
           return v;
       }