
Generate boilerplate code by transforming arguments to string literals


In one of my projects I'm trying to find a more generic approach for writing our in-house simplified XML files. So far I have used Boost.Fusion successfully for this.

For every new XML file format the client has to write the following. Assume that the XML file contains a tag Person and a tag Company.

#include <boost/fusion/include/define_struct.hpp>
#include <boost/variant.hpp>
#include <map>
#include <string>
#include <utility>
#include <vector>

BOOST_FUSION_DEFINE_STRUCT(
(),
Person,
(std::string, name) // name is mandatory for all tags
(int, age))

BOOST_FUSION_DEFINE_STRUCT(
(),
Company,
(std::string, name) // name is mandatory for all tags
(int, noEmployees)
(std::string, location)
)

typedef boost::variant<Person, Company> Types;

std::vector<std::pair<Types, std::vector<std::string>>> xmlTags =
{
    {Person(), {"name", "age"}},
    {Company(), {"name", "noEmployees", "location"}},
};

int main(int argc, char**args) {
}

I'm still not quite satisfied with the solution above, because the user still has to define xmlTags by hand, and it should be generated automatically. Types should be generated as well. The client might forget to update xmlTags, leading to erroneous XML files or a crashing XML reader/writer.

A good solution might look like:

DEFINE_XML_TAGS(
    XML_TAG(
        Person,
        (int, age)
    )
    XML_TAG(
        Company,
        (int, noEmployees)
        (std::string, location)
    )
)

This would generate all the boilerplate code for me. I think Boost.Preprocessor would be part of such a solution.

However, I have no idea how to accomplish this, as I have never used that library. Fortunately, our compiler supports variadic template arguments.

Does anyone know how to accomplish the desired result?


Solution

  • If you want to use the Boost.Preprocessor library, you need to be familiar with two of its fundamental "data types": the sequence and the tuple. A sequence is a series of parenthesized elements such as (a)(b)(c); a tuple is a parenthesized, comma-separated list such as (a,b,c). The reference section of the documentation lists all the macros the library provides. I'll explain the ones I use below.

    There are two macros in the interface: XML_TAG and DEFINE_XML_TAGS.
    XML_TAG is really simple: it just puts its arguments inside two sets of parentheses. As a result, however many XML_TAGs you write next to each other, they expand to a sequence whose elements are tuples of the form (struct_name,sequence_of_type_and_name_pairs).
    DEFINE_XML_TAGS is the macro that does all the work. It uses three helper macros: GENERATE_STRUCT_DEFS, GENERATE_VARIANT_OF_TYPES and GENERATE_XMLTAGS.
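
To make the shape of that sequence visible, here is a minimal sketch that turns the expansion into text. XML_TAG is the definition from define_xml_tags.hpp; SHOW, SHOW_I and tag_sequence_text are hypothetical helpers that exist only for this illustration and are not part of Boost or of the answer's header.

```cpp
#include <string>

// Same definition as in define_xml_tags.hpp: wrap the pair in two sets of parentheses.
#define XML_TAG(NAME, MEMBER_SEQ) ((NAME, MEMBER_SEQ))

// Hypothetical helpers (not from Boost): fully expand the arguments first,
// then turn the resulting tokens into a string literal.
#define SHOW_I(...) #__VA_ARGS__
#define SHOW(...) SHOW_I(__VA_ARGS__)

// Returns the text of the sequence formed by two adjacent XML_TAG invocations.
inline std::string tag_sequence_text() {
    return SHOW(
        XML_TAG(Person, (int, age))
        XML_TAG(Company, (int, noEmployees)(std::string, location)));
}
```

Ignoring whitespace, the returned text is ((Person,(int,age)))((Company,(int,noEmployees)(std::string,location))): a sequence with two elements, each of which is a 2-tuple.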

    GENERATE_VARIANT_OF_TYPES
    It invokes ENUMERATE_TYPES(TAG_SEQ) to get a comma-separated list of types. At this point TAG_SEQ is ((Person,(int,age)))((Company,(int,noEmployees)(std::string,location))) and we want Person,Company. BOOST_PP_SEQ_ENUM(SEQ) takes a sequence and returns its elements separated by commas, so we need BOOST_PP_SEQ_ENUM((Person)(Company)). BOOST_PP_SEQ_FOR_EACH(MACRO,DATA,SEQ) calls MACRO once for each element of SEQ, passing along whatever DATA you give it. So BOOST_PP_SEQ_FOR_EACH(GET_TYPE_SEQUENCE,_,TAG_SEQ) calls GET_TYPE_SEQUENCE with (Person,(int,age)) and with (Company,(int,noEmployees)(std::string,location)). GET_TYPE_SEQUENCE then simply takes the first element of each tuple using BOOST_PP_TUPLE_ELEM and puts it inside a set of parentheses.
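
The same extraction can be sketched without Boost, which keeps the mechanism visible. This is only an illustration under assumptions: FIRST is a simplified stand-in for BOOST_PP_TUPLE_ELEM(2,0,...), the TYPES_* chain and FINISH are hand-rolled replacements for BOOST_PP_SEQ_FOR_EACH plus BOOST_PP_SEQ_ENUM, and TypeList, Person and Company are placeholder types. It does not reflect how Boost.Preprocessor implements these macros, and it needs at least one tag.

```cpp
#include <type_traits>

struct Person {};   // placeholder types for the illustration
struct Company {};

// Same shape as the answer's XML_TAG.
#define XML_TAG(NAME, MEMBER_SEQ) ((NAME, MEMBER_SEQ))

// Simplified stand-in for BOOST_PP_TUPLE_ELEM(2,0,TUPLE): first element of a 2-tuple.
#define FIRST_I(A, B) A
#define FIRST(TUPLE) FIRST_I TUPLE

// Hand-rolled walk over the sequence: each macro consumes one element and
// leaves the name of the next macro behind to consume the following element.
#define TYPES_HEAD(TUPLE) FIRST(TUPLE) TYPES_A
#define TYPES_A(TUPLE) , FIRST(TUPLE) TYPES_B
#define TYPES_B(TUPLE) , FIRST(TUPLE) TYPES_A
#define TYPES_A_END
#define TYPES_B_END

// Paste _END onto the leftover macro name so that it expands to nothing.
#define PASTE_END(...) __VA_ARGS__##_END
#define FINISH(...) PASTE_END(__VA_ARGS__)
#define ENUMERATE_TYPES(TAG_SEQ) FINISH(TYPES_HEAD TAG_SEQ)

template <typename... Ts> struct TypeList {};

using Types = TypeList<ENUMERATE_TYPES(
    XML_TAG(Person, (int, age))
    XML_TAG(Company, (int, noEmployees)(std::string, location)))>;

// The member sequences are discarded; only the struct names survive.
static_assert(std::is_same<Types, TypeList<Person, Company>>::value,
              "ENUMERATE_TYPES should yield Person, Company");
```

In the real header the result is fed to boost::variant instead of TypeList.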

    GENERATE_XMLTAGS
    It calls GENERATE_PAIRS, which in turn runs a BOOST_PP_SEQ_FOR_EACH using GENERATE_ONE_PAIR. As explained in the previous section, GENERATE_ONE_PAIR receives each of the tuples (struct_name,sequence_of_type_name_pairs). It takes the name, adds a pair of parentheses after it, and then calls GENERATE_VECTOR_OF_MEMBER_NAMES with the sequence_of_type_name_pairs. GENERATE_VECTOR_OF_MEMBER_NAMES first adds the mandatory "name" member and then uses BOOST_PP_SEQ_ENUM in much the same way as the macro explained above. The difference is that it needs a little trick first: the member sequence looks like (int,noEmployees)(std::string,location), so its elements do not have two sets of parentheses, and the GENERATE_NAME_SEQUENCE_FILLER macros add the missing set (this trick is explained in the 3rd approach of the answer linked from the original post). GENERATE_MEMBER_NAME_SEQUENCE then simply takes the name of the member, converts it to a string and puts a set of parentheses around it.
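
The extra-parentheses trick can be tried in isolation. In the sketch below the four GENERATE_NAME_SEQUENCE_FILLER macros are copied from define_xml_tags.hpp; MY_CAT, WRAP_MEMBERS, SHOW and wrapped_members_text are hypothetical names introduced for the illustration, with MY_CAT standing in for BOOST_PP_CAT.

```cpp
#include <string>

// Copied from the answer: each macro wraps one (type, name) pair in an extra
// set of parentheses and leaves the other macro's name behind for the next pair.
#define GENERATE_NAME_SEQUENCE_FILLER_0(X, Y) ((X, Y)) GENERATE_NAME_SEQUENCE_FILLER_1
#define GENERATE_NAME_SEQUENCE_FILLER_1(X, Y) ((X, Y)) GENERATE_NAME_SEQUENCE_FILLER_0
// After the last pair one filler name is left over; pasting _END onto it
// yields one of these two macros, which expand to nothing.
#define GENERATE_NAME_SEQUENCE_FILLER_0_END
#define GENERATE_NAME_SEQUENCE_FILLER_1_END

// Hypothetical stand-in for BOOST_PP_CAT: expand both arguments, then paste.
#define MY_CAT_I(A, B) A##B
#define MY_CAT(A, B) MY_CAT_I(A, B)

// Hypothetical wrapper showing only the filler step of GENERATE_VECTOR_OF_MEMBER_NAMES.
#define WRAP_MEMBERS(MEMBER_SEQ) MY_CAT(GENERATE_NAME_SEQUENCE_FILLER_0 MEMBER_SEQ, _END)

// Hypothetical helpers to look at the expansion as text.
#define SHOW_I(...) #__VA_ARGS__
#define SHOW(...) SHOW_I(__VA_ARGS__)

inline std::string wrapped_members_text() {
    return SHOW(WRAP_MEMBERS((int, noEmployees)(std::string, location)));
}
```

Ignoring whitespace, the returned text is ((int,noEmployees))((std::string,location)). Each element is now a single parenthesized tuple, so a per-element macro receives it as one argument instead of having it split at the comma.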

    GENERATE_STRUCT_DEFS
    BOOST_PP_REPEAT(N,MACRO,DATA) calls MACRO N times, passing DATA and the current repetition index. GENERATE_ONE_STRUCT_DEF takes the index-th element of the sequence, extracts first the name of the struct and then the sequence of type-name pairs, and calls DO_GENERATE_ONE_STRUCT_DEF with those values. Finally, DO_GENERATE_ONE_STRUCT_DEF builds the BOOST_FUSION_DEFINE_STRUCT macro invocation, prepending the mandatory (std::string, name) member.
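
The "call MACRO N times with DATA and an index" idea can be sketched with a much reduced, hand-written stand-in. MY_REPEAT, DECLARE_FIELD and Record are hypothetical names for this illustration. Unlike the real BOOST_PP_REPEAT, the stand-in only supports literal counts from 0 to 3 and calls MACRO(INDEX, DATA), whereas the real macro calls MACRO(z, n, data).

```cpp
// Hypothetical, reduced stand-in for BOOST_PP_REPEAT (literal counts 0 to 3 only).
// Each level expands the level below it and then adds one more call with the next index.
#define MY_REPEAT_0(MACRO, DATA)
#define MY_REPEAT_1(MACRO, DATA) MACRO(0, DATA)
#define MY_REPEAT_2(MACRO, DATA) MY_REPEAT_1(MACRO, DATA) MACRO(1, DATA)
#define MY_REPEAT_3(MACRO, DATA) MY_REPEAT_2(MACRO, DATA) MACRO(2, DATA)
#define MY_REPEAT(N, MACRO, DATA) MY_REPEAT_##N(MACRO, DATA)

// Called once per repetition: declares an int member named DATA followed by the index.
#define DECLARE_FIELD(INDEX, DATA) int DATA##INDEX;

struct Record {
    // Expands to: int field0; int field1; int field2;
    MY_REPEAT(3, DECLARE_FIELD, field)
};
```

GENERATE_STRUCT_DEFS follows the same pattern, except that the per-repetition macro uses the index to pick an element out of TAG_SEQ and emits a whole struct definition.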

    I think, but I'm not knowledgeable enough to be sure, that there is a bug in BOOST_FUSION_ADAPT_STRUCT_NAMESPACE_DEFINITION_END. It uses BOOST_PP_REPEAT_1 directly where I think it should just use BOOST_PP_REPEAT. This matters here because BOOST_FUSION_DEFINE_STRUCT is invoked from inside a BOOST_PP_REPEAT, so the first repetition level is presumably already in use. I have undefined that macro and redefined it using BOOST_PP_REPEAT, and everything seems to work, but you probably shouldn't trust it blindly.

    Test Running on WandBox

    define_xml_tags.hpp

    #include <boost/fusion/include/define_struct.hpp>
    #include <boost/variant.hpp>
    #include <string>
    #include <vector>
    #include <utility>
    #include <boost/preprocessor/arithmetic/dec.hpp>
    #include <boost/preprocessor/cat.hpp>
    #include <boost/preprocessor/repetition/repeat.hpp>
    #include <boost/preprocessor/seq/elem.hpp>
    #include <boost/preprocessor/seq/enum.hpp>
    #include <boost/preprocessor/seq/for_each.hpp>
    #include <boost/preprocessor/seq/for_each_i.hpp>
    #include <boost/preprocessor/seq/size.hpp>
    #include <boost/preprocessor/stringize.hpp>
    #include <boost/preprocessor/tuple/elem.hpp>
    
    //I think there is a bug in the original macro, it uses BOOST_PP_REPEAT_1 where I think it should use BOOST_PP_REPEAT, but I don't know enough to know for sure
    #undef BOOST_FUSION_ADAPT_STRUCT_NAMESPACE_DEFINITION_END
    
    #define BOOST_FUSION_ADAPT_STRUCT_NAMESPACE_DEFINITION_END(NAMESPACE_SEQ)       \
        BOOST_PP_REPEAT(                                                          \
            BOOST_PP_DEC(BOOST_PP_SEQ_SIZE(NAMESPACE_SEQ)),                         \
            BOOST_FUSION_ADAPT_STRUCT_NAMESPACE_END_I,                              \
            _)
    
    //helps form a SEQUENCE of TUPLES
    #define XML_TAG(NAME,MEMBER_SEQ) ((NAME,MEMBER_SEQ)) 
    
    //helpers for GENERATE_STRUCT_DEFS, read from the bottom to the top
    #define DO_GENERATE_ONE_STRUCT_DEF(NAME,MEMBER_SEQ) \
    BOOST_FUSION_DEFINE_STRUCT( (), NAME, (std::string, name) MEMBER_SEQ)
    
    #define GENERATE_ONE_STRUCT_DEF(Z,INDEX,TAG_SEQ) \
    DO_GENERATE_ONE_STRUCT_DEF(BOOST_PP_TUPLE_ELEM(2,0,BOOST_PP_SEQ_ELEM(INDEX,TAG_SEQ)), BOOST_PP_TUPLE_ELEM(2,1,BOOST_PP_SEQ_ELEM(INDEX,TAG_SEQ)))
    
    #define GENERATE_STRUCT_DEFS(TAG_SEQ) \
    BOOST_PP_REPEAT(BOOST_PP_SEQ_SIZE(TAG_SEQ),GENERATE_ONE_STRUCT_DEF,TAG_SEQ)
    
    
    //helpers for GENERATE_VARIANT_OF_TYPES, bottom to top
    #define GET_TYPE_SEQUENCE(R,DATA,NAME_MEMBERSEQ_TUPLE) \
    (BOOST_PP_TUPLE_ELEM(2,0,NAME_MEMBERSEQ_TUPLE))
    
    #define ENUMERATE_TYPES(TAG_SEQ) \
    BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH(GET_TYPE_SEQUENCE,_,TAG_SEQ))
    
    #define GENERATE_VARIANT_OF_TYPES(TAG_SEQ) \
    typedef boost::variant<ENUMERATE_TYPES(TAG_SEQ)> Types;
    
    
    //helpers for GENERATE_XMLTAGS, go from bottom to top in order to understand
    
    //Heavily "inspired" from BOOST_FUSION_ADAPT_STRUCT
    #define GENERATE_NAME_SEQUENCE_FILLER_0(X, Y)  \
        ((X, Y)) GENERATE_NAME_SEQUENCE_FILLER_1
    #define GENERATE_NAME_SEQUENCE_FILLER_1(X, Y)  \
        ((X, Y)) GENERATE_NAME_SEQUENCE_FILLER_0
    #define GENERATE_NAME_SEQUENCE_FILLER_0_END
    #define GENERATE_NAME_SEQUENCE_FILLER_1_END
    
    #define GENERATE_MEMBER_NAME_SEQUENCE(R,DATA,INDEX,TYPE_NAME_TUPLE) (BOOST_PP_STRINGIZE(BOOST_PP_TUPLE_ELEM(2,1,TYPE_NAME_TUPLE)))
    
    #define GENERATE_VECTOR_OF_MEMBER_NAMES(MEMBER_SEQ) \
    { "name", BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(GENERATE_MEMBER_NAME_SEQUENCE,_,BOOST_PP_CAT(GENERATE_NAME_SEQUENCE_FILLER_0 MEMBER_SEQ,_END))) }
    
    #define GENERATE_ONE_PAIR(R,DATA,NAME_MEMBERSEQ_TUPLE) \
    { BOOST_PP_TUPLE_ELEM(2,0,NAME_MEMBERSEQ_TUPLE)(), GENERATE_VECTOR_OF_MEMBER_NAMES(BOOST_PP_TUPLE_ELEM(2,1,NAME_MEMBERSEQ_TUPLE)) },
    
    #define GENERATE_PAIRS(TAG_SEQ) \
    BOOST_PP_SEQ_FOR_EACH(GENERATE_ONE_PAIR,_,TAG_SEQ)
    
    #define GENERATE_XMLTAGS(TAG_SEQ) \
    const std::vector<std::pair<Types,std::vector<std::string>>> xmlTags = { GENERATE_PAIRS(TAG_SEQ) };
    
    
    //This is the actual macro, it simply invokes three different macros that do a different task each
    #define DEFINE_XML_TAGS(TAG_SEQ) \
    GENERATE_STRUCT_DEFS(TAG_SEQ) \
    GENERATE_VARIANT_OF_TYPES(TAG_SEQ) \
    GENERATE_XMLTAGS(TAG_SEQ)
    

    main.cpp

    #include <iostream>
    #include <boost/fusion/include/io.hpp>
    #include <boost/fusion/include/as_vector.hpp>
    #include <boost/variant/static_visitor.hpp>
    
    #include "define_xml_tags.hpp"
    
    
    
    DEFINE_XML_TAGS(
        XML_TAG(
            Person,
            (int, age)
        )
        XML_TAG(
            Company,
            (int, noEmployees)
            (std::string, location)
        )
    )
    
    struct printer : boost::static_visitor<void> {
        void operator()(const Person& p) const
        {
            std::cout << "This is a person:" << boost::fusion::as_vector(p) << '\n';
        }
    
        void operator()(const Company& c) const
        {
            std::cout << "This is a company:" << boost::fusion::as_vector(c) << '\n';
        }
    };
    
    void identify(Types v)
    {
        boost::apply_visitor(printer(),v);
    }
    
    
    int main() 
    {
        Person p;
        p.name="John";
        p.age = 18;
    
        identify(p);
    
        Company c;
        c.name="Mpany Co";
        c.noEmployees=123;
        c.location="Fake St";
        identify(c);
    
    
        std::cout << "\nChecking xmlTags:\n";
        for(const auto& pair : xmlTags)
        {
            identify(pair.first);
            std::cout << "It has the following members:\n";
            for(const auto& str : pair.second)
                std::cout << str << '\n';
        }
    
        std::cout << std::endl;
    }