Why does boost::any exhibit undefined behaviour in boost::program_options?


Let's take an example directly from boost's documentation:

#include <iostream>
#include <string>
#include <boost/program_options.hpp>

int main(int const ac, char** const av){

  // Declare the supported options.
  namespace po = boost::program_options;
  using namespace std;
  po::options_description desc("Allowed options");
  desc.add_options()
      ("help", "produce help message")
      ("compression", po::value<int>(), "set compression level")
  ;

  po::variables_map vm;
  po::store(po::parse_command_line(ac, av, desc), vm);
  po::notify(vm);    

  if (vm.count("help")) {
      cout << desc << "\n";
      return 1;
  }

  if (vm.count("compression")) {
      cout << "Compression level was set to "
           << vm["compression"].as<int>() << ".\n";
  } else {
      cout << "Compression level was not set.\n";
  }
}

The program behaves correctly.
However, when it is compiled with GCC's (or Clang's) undefined-behaviour sanitizer:

g++ -std=c++1z -o main main.cpp -fsanitize=undefined -lboost_program_options

It produces the following runtime error:

./main --compression="1"
/usr/include/boost/any.hpp:243:16: runtime error: downcast of address 0x000001153fb0 which does not point to an object of type 'holder'
0x000001153fb0: note: object is of type 'boost::any::holder<int>'
 00 00 00 00  20 bc 42 00 00 00 00 00  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  31 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'boost::any::holder<int>'
Compression level was set to 1.

I've distilled the problem to something smaller:

#include <iostream>
#include <string>
#include <boost/program_options.hpp>

int main(int const argc, char** const argv){

  using namespace boost::program_options;

  //create description
  options_description desc("");

  //add entry
  desc.add_options()
  ("foo",value<std::string>(),"desc");

  //create variable map
  variables_map vm;

  //store variables in map
  positional_options_description pod;
  store(command_line_parser(argc, argv).options(desc).positional(pod).run(), vm);
  notify(vm);

  //get variable out of map
  std::string foo;
  if (vm.count("foo")){
    foo = vm["foo"].as<std::string>(); //UNDEFINED BEHAVIOUR
  }
}

Compiled with:

g++ -std=c++1z -o main main.cpp -fsanitize=undefined -lboost_program_options

When executed:

./main --foo="hello"
/usr/include/boost/any.hpp:243:16: runtime error: downcast of address 0x000000d85fd0 which does not point to an object of type 'holder'
0x000000d85fd0: note: object is of type 'boost::any::holder<std::string>'
 00 00 00 00  b0 c5 5e 90 f8 7f 00 00  98 5f d8 00 00 00 00 00  00 00 00 00 00 00 00 00  31 00 00 00
              ^~~~~~~~~~~~~~~~~~~~~~~
              vptr for 'boost::any::holder<std::string>'

Clearly it is the cast out of the variable map that's causing the UB:

vm["foo"].as<std::string>()  

Yet this is exactly how the online documentation does it.

Is this a false positive? Is there a bug in my Boost distribution?
How can I stop the sanitizer from flagging this if it is indeed safe?


Solution

  • This really does seem to be undefined behaviour. The following code reproduces the issue:

    #include <boost/any.hpp>
    
    int main()
    {
        int value = 0;
        int const& const_ref = value;
        boost::any any_var {const_ref};
        boost::any_cast<int&>(any_var); // ubsan error
    }
    

    Here any_var is constructed from a const-qualified value and then accessed as a non-const int. Running this code under the sanitizer raises a runtime error similar to yours:

    /usr/local/include/boost/any.hpp:259:16: runtime error: downcast of address 0x60200000eff0 which does not point to an object of type 'any::holder<int>'
    0x60200000eff0: note: object is of type 'boost::any::holder<int const>'
     01 00 00 0c  b0 ee 49 00 00 00 00 00  00 00 00 00 be be be be  00 00 00 00 00 00 00 00  00 00 00 00
                  ^~~~~~~~~~~~~~~~~~~~~~~
                  vptr for 'boost::any::holder<int const>'
    SUMMARY: AddressSanitizer: undefined-behavior /usr/local/include/boost/any.hpp:259:16 in 
    /usr/local/include/boost/any.hpp:259:73: runtime error: member access within address 0x60200000eff0 which does not point to an object of type 'any::holder<int>'
    0x60200000eff0: note: object is of type 'boost::any::holder<int const>'
     01 00 00 0c  b0 ee 49 00 00 00 00 00  00 00 00 00 be be be be  00 00 00 00 00 00 00 00  00 00 00 00
                  ^~~~~~~~~~~~~~~~~~~~~~~
                  vptr for 'boost::any::holder<int const>'
    SUMMARY: AddressSanitizer: undefined-behavior /usr/local/include/boost/any.hpp:259:73 in 
    

    The problem is that any_cast<int&> tries to access the stored value by downcasting the type-erased pointer to any::holder<int>, whereas the object's actual type is any::holder<int const>. These are distinct class types, hence the undefined behaviour.
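
    To make the mechanism concrete, here is a minimal sketch of a type-erased holder. It is not Boost's code: placeholder, holder and checked_get are simplified stand-ins made up for illustration, and the sketch assumes the container validates the stored type by comparing std::type_info objects. Because typeid ignores top-level const, such a comparison cannot tell holder<int> from holder<int const>, whereas a dynamic_cast can:

    ```cpp
    #include <cassert>
    #include <typeinfo>

    // Hypothetical stand-in for the type-erased base class (not Boost's code).
    struct placeholder {
        virtual ~placeholder() = default;
        virtual const std::type_info& type() const = 0;
    };

    // Hypothetical stand-in for the typed holder derived from the base.
    template <class T>
    struct holder : placeholder {
        explicit holder(const T& v) : held(v) {}
        const std::type_info& type() const override { return typeid(T); }
        T held;
    };

    // Checked access: dynamic_cast inspects the dynamic type and
    // yields nullptr when the object is not a holder<T>.
    template <class T>
    T* checked_get(placeholder* p) {
        auto* h = dynamic_cast<holder<T>*>(p);
        return h ? &h->held : nullptr;
    }

    int main() {
        holder<const int> stored(42);
        placeholder* erased = &stored;

        // typeid drops top-level cv-qualifiers, so this comparison succeeds
        // even though the object is a holder<const int>.
        assert(erased->type() == typeid(int));

        // holder<int> and holder<const int> are nevertheless different classes.
        assert(checked_get<int>(erased) == nullptr);
        assert(*checked_get<const int>(erased) == 42);

        // static_cast<holder<int>*>(erased) would compile here, but the object
        // is not a holder<int>, so using the result is undefined behaviour.
    }
    ```

    If a library follows a successful type_info comparison with a static_cast to holder<int>, the cast compiles, but the object is really a holder<int const>. That is the kind of mismatch the sanitizer's vptr check reports.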

    The undefined behaviour in boost::program_options

    In boost::program_options, a value of type T is stored as an any object in the typed_value<T> class. The any object is constructed like this:

    // In class typed_value
    typed_value* implicit_value(const T &v)
    {
        m_implicit_value = boost::any(v);
        m_implicit_value_as_text =
            boost::lexical_cast<std::string>(v);
        return this;
    }
    

    Note that the value v is declared as a const reference. However, typed_value<T>::notify() (which is called from po::notify() in your code) accesses the stored value as a plain T, without the const:

    template<class T, class charT>
    void
    typed_value<T, charT>::notify(const boost::any& value_store) const
    {
        const T* value = boost::any_cast<T>(&value_store);
        ...
    }
    

    This causes the undefined behaviour.

    Workaround

    In boost/program_options/value_semantic.hpp, change the following line of the implicit_value() function

    m_implicit_value = boost::any(v);
    

    to

    m_implicit_value = boost::any(T(v));
    

    This makes the sanitizer happy. I'm not sure if this is a real fix though.
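
    A possible reason why T(v) makes a difference can be sketched without Boost. The helper below, stores_const, is hypothetical and only models a forwarding constructor that strips the reference from its deduced type but keeps the cv-qualifiers. Whether boost::any in your version deduces its holder type this way is an assumption on my part, not something I have verified:

    ```cpp
    #include <cassert>
    #include <type_traits>

    // Hypothetical model: reports whether a forwarding constructor that only
    // removes the reference from its deduced parameter type would end up
    // storing a const-qualified type.
    template <class U>
    constexpr bool stores_const(U&&) {
        return std::is_const<typename std::remove_reference<U>::type>::value;
    }

    int main() {
        int value = 0;
        const int& v = value;

        // Passing the const lvalue deduces U = const int&, so const survives.
        assert(stores_const(v));

        // T(v) creates a non-const temporary: U = int, so nothing is const.
        assert(!stores_const(int(v)));

        // A non-const lvalue deduces U = int&, which is not const either.
        assert(!stores_const(value));
    }
    ```

    Under that assumption, wrapping the argument in T(v) simply ensures the stored type is T rather than T const, which would match the type that notify() later asks for.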