Tags: c++, c++11, boost, boost-spirit-karma

Boost karma: how does this implicit call to transform_attribute work? (or doesn't?)


I have the following piece of code that seems to work fine (I based the semantic actions on the question "reuse parsed variable with boost karma").

#include <iostream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>

#include <boost/fusion/include/adapt_struct.hpp>
#include <boost/fusion/include/sequence.hpp>
#include <boost/spirit/include/karma.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
#include <boost/spirit/include/phoenix_fusion.hpp>
#include <boost/spirit/include/phoenix_bind.hpp>
#include <boost/spirit/include/support_attributes.hpp>
#include <boost/spirit/include/support_adapt_adt_attributes.hpp>

using namespace boost::spirit;

struct DataElement
{
  DataElement(const std::string& s) : str_(s) {}

  const std::string& str() const { return str_; }
  std::string& str() { return str_; }
  std::string str_;
};
using Data = std::vector<std::shared_ptr<const DataElement>>;

namespace boost {
  namespace spirit {
    namespace traits {

      template<>
      struct transform_attribute<std::shared_ptr<const DataElement> const, const DataElement&, karma::domain>
      {
        using type = const DataElement&;
        static type pre(const std::shared_ptr<const DataElement>& val) { return *val; }
      };

    }
  }
}

BOOST_FUSION_ADAPT_ADT(
  DataElement,
  (std::string&, const std::string&, obj.str(), obj.str())
  );

template<typename Iterator>
struct TheGrammar: public karma::grammar<Iterator, Data()>
{
  TheGrammar(): karma::grammar<Iterator, Data()>(start)
  {
    start %= -(elt % karma::eol);
    elt %=
      karma::lit("'some prefix'")
      << karma::string [karma::_1 = boost::phoenix::at_c<0>(karma::_val)]
      << karma::lit("'some infix 1'")
      << karma::string [karma::_1 = boost::phoenix::at_c<0>(karma::_val)]
      << karma::lit("'some infix 2'")
      << karma::string [karma::_1 = boost::phoenix::at_c<0>(karma::_val)]
      << karma::lit("'some suffix'")
      ;
  }

  karma::rule<Iterator, Data()> start;
  karma::rule<Iterator, const DataElement&()> elt;
};

int main(void)
{
  Data vec = {
    std::make_shared<DataElement>("one"),
    std::make_shared<DataElement>("two"),
    std::make_shared<DataElement>("three"),
    std::make_shared<DataElement>("four"),
    std::make_shared<DataElement>("five"),
    std::make_shared<DataElement>("six"),
    std::make_shared<DataElement>("seven"),
    std::make_shared<DataElement>("eight"),
  };
  using iterator_type = std::ostream_iterator<char>;
  iterator_type out(std::cout);

  TheGrammar<iterator_type> grammar;
  // karma::generate returns true on success; map that to a conventional exit code
  return karma::generate(out, grammar, vec) ? 0 : 1;
}

I would like to understand a couple of things:

  1. Why don't I need to use karma::attr_cast anywhere? The attribute of my start rule is a vector of std::shared_ptr, whereas the elt rule works on a const reference to the actual object. I originally tried attr_cast but got nowhere, then tried this version half-heartedly just in case, and it worked...
  2. Why does it still compile if I comment out my custom transform_attribute altogether? Is there some default std::shared_ptr<T> -> T& transform_attribute provided? I couldn't find much, but maybe I'm not looking in the right place?
  3. If I comment out my custom transform_attribute, the code still compiles, as mentioned above, but there's clearly some memory corruption at runtime: karma::string generates garbage. In a way, I can understand that something funny must be happening, since I never tell karma how to get from my shared_ptr to the objects. Is the fact that it compiles the actual error/bug?

Thanks a lot for your time and help!


Solution

    1. Each rule has an implicit attr_cast to its declared attribute type.
    2. Somewhere along the way Spirit's type compatibility rules go haywire. All I've seen is that it has to do with the fact that the string type is a container. At some point it "copy-constructs" a std::string that appears to have length 97332352. Unsurprisingly, that is itself wrong, and it happens to trigger UB because the ranges that end up being passed to memcpy overlap:

      Source and destination overlap in memcpy(0x60c1040, 0x5cd2c90, 97332352)
         at 0x4C30573: memcpy@@GLIBC_2.14 (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
         by 0x401B26: copy (char_traits.h:290)
         by 0x401B26: _S_copy (basic_string.h:299)
         by 0x401B26: _S_copy_chars (basic_string.h:342)
         by 0x401B26: void std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*>(char*, char*, std::forward_iterator_tag) [clone .isra.53] (basic_string.tcc:229)
         by 0x402442: _M_construct_aux<char*> (basic_string.h:195)
         by 0x402442: _M_construct<char*> (basic_string.h:214)
         by 0x402442: basic_string (basic_string.h:401)
         by 0x402442: call<const boost::spirit::unused_type> (extract_from.hpp:172)
         by 0x402442: call<const boost::spirit::unused_type> (extract_from.hpp:184)
         by 0x402442: extract_from<std::__cxx11::basic_string<char>, boost::fusion::extension::adt_attribute_proxy<DataElement, 0, true>, const boost::spirit::unused_type> (extract_from.hpp:217)
         by 0x402442: extract_from<std::__cxx11::basic_string<char>, boost::fusion::extension::adt_attribute_proxy<DataElement, 0, true>, const boost::spirit::unused_type> (extract_from.hpp:237)
         by 0x402442: pre (attributes.hpp:23)
      
    3. Yes, that's a QoI (quality of implementation) issue.

      The problem is often with C++'s implicit conversions. Pointer types have many unexpected conversions, and shared pointers have their contextual conversion to bool.

    More notes:

    1. Your Fusion adaptation seems flawed: val was not being used in the setter. The setter expression should assign it:

      BOOST_FUSION_ADAPT_ADT(DataElement, (std::string &, const std::string &, obj.str(), obj.str() = val))
      
    2. You're doing many things I've learned to avoid.