Search code examples
c++multithreadingboostboost-propertytree

Boost Property Tree fails to retrieve simple JSON in multi-threaded context


I am trying to parse a simple JSON string using Boost.PropertyTree in my C/C++ application.

{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}

Here is how I've set it up in my C/C++ multi-threaded application (manually defining the JSON string to demonstrate the issue).

ParseJson.cpp

#ifdef __cplusplus
extern "C"
{
#endif

#include "ParseJson.hpp"

#ifdef __cplusplus
}
#endif

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" MyStruct * const parseJsonMessage(char * jsonMessage, unsigned int const messageLength) {
    MyStruct * myStruct = new MyStruct();
    // Create empty property tree object.
    ptree tree;

    if (myStruct != nullptr) {
        try {
            // Create an istringstream from the JSON message.
            std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");   // doesn't work
            std::istringstream isStreamJson(jsonMessageString);

            // Parse the JSON into the property tree.
            std::cout << "Reading JSON ..." << jsonMessageString << "...";
            read_json(isStreamJson, tree);
            std::cout << " Done!" << std::endl;

            // Get the values from the property tree.
            printf("version: %d\n", tree.get<int>("header.version"));
            printf("source: %d\n", tree.get<int>("header.source"));
            printf("coffee: %s\n", tree.get<std::string>("coffee").c_str());
        }
        catch (boost::property_tree::ptree_bad_path badPathException) {
            std::cout << "Exception caught for bad path: " << badPathException.what() << std::endl;
            return nullptr;
        }
        catch (boost::property_tree::ptree_bad_data badDataException) {
            std::cout << "Exception caught for bad data: " << badDataException.what() << std::endl;
            return nullptr;
        }
        catch (std::exception exception) {
            std::cout << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
            return nullptr;
        }
    }
    return myStruct;
}

The read_json() call appears to complete but the get() calls to retrieve the parsed data from the property tree fail:

Reading JSON ...{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}... Done!
Exception caught for bad path: No such node (header.version)

I am using Boost 1.53 on RHEL 7 (compiler is gcc/g++ version 4.8.5), and I've tried both suggestions mentioned in this post related to Boost.PropertyTree and multi-threading. I've defined the BOOST_SPIRIT_THREADSAFE compile definition globally for the project. I also tried the atomic swap solution suggested for that post. Neither of these had any effect on the symptoms.

Oddly enough, I can use the other public methods for Boost.Property tree to grab the values manually:

std::cout << "front.key: " << tree.front().first << std::endl;
std::cout << "front.front.key: " << tree.front().second.front().first << std::endl;
std::cout << "front.front.value: " << tree.front().second.front().second.get_value_optional<std::string>() << std::endl;

which shows the JSON was actually parsed:

front.key: header
front.front.key: version
front.front.value:  42

Note, I had to use std::string to grab the header.version value, as trying to use get_value_optional<int>() crashes also.

However, this manual approach is not scalable; my application needs to accept several, more complicated JSON structures.

When I tried more complicated JSON strings, these were also successfully parsed, but accessing values using the get() methods similarly failed, this time crashing the program. Here is one of the GDB backtraces I pulled from the crashes, but I wasn't familiar enough with Boost to get anything useful from this:

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffebfff700 (LWP 7176)]
0x00007ffff5aa8200 in std::locale::locale(std::locale const&) () from /lib64/libstdc++.so.6
Missing separate debuginfos, use: debuginfo-install boost-system-1.53.0-28.el7.x86_64 boost-thread-1.53.0-28.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 elfutils-libelf-0.176-5.el7.x86_64 elfutils-libs-0.176-5.el7.x86_64 glibc-2.17-292.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-37.el7_7.2.x86_64 libattr-2.4.46-13.el7.x86_64 libcap-2.22-10.el7.x86_64 libcom_err-1.42.9-16.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libselinux-2.5-14.1.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 openssl-libs-1.0.2k-19.el7.x86_64 pcre-8.32-17.el7.x86_64 systemd-libs-219-67.el7_7.2.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007ffff5aa8200 in std::locale::locale(std::locale const&) () from /lib64/libstdc++.so.6
#1  0x00007ffff5ab6051 in std::basic_ios<char, std::char_traits<char> >::imbue(std::locale const&) () from /lib64/libstdc++.so.6
#2  0x000000000041e322 in boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int>::get_value(std::string const&) ()
#3  0x000000000041c5b2 in boost::optional<int> boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get_value_optional<int, boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int> >(boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int>) const ()
#4  0x000000000041aa61 in boost::enable_if<boost::property_tree::detail::is_translator<boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int> >, int>::type boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get_value<int, boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int> >(boost::property_tree::stream_translator<char, std::char_traits<char>, std::allocator<char>, int>) const ()
#5  0x000000000041985d in int boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get_value<int>() const ()
#6  0x0000000000418673 in int boost::property_tree::basic_ptree<std::string, std::string, std::less<std::string> >::get<int>(boost::property_tree::string_path<std::string, boost::property_tree::id_translator<std::string> > const&) const ()
#7  0x0000000000414f4a in parseJsonMessage ()
#8  0x000000000040d8cd in ProcessThread () at ../../src/Processing.c:906
#9  0x00007ffff7bc6ea5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007ffff55538cd in clone () from /lib64/libc.so.6

FWIW, I tried putting this code into a simple (single-threaded) main.cpp:

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

int main(int numArgs, char * const * const args) {
    
    ptree tree;

    try {
        // Create an istringstream from the JSON message.
        std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");
        std::istringstream isStreamJson(jsonMessageString);

        // Parse the JSON into the property tree.
        std::cout << "Reading JSON..." << jsonMessageString << "...";
        read_json(isStreamJson, tree);
        std::cout << " Done!" << std::endl;
        // Print what we parsed.
        std::cout << "version: " << tree.get<int>("header.version") << std::endl;
        std::cout << "source: " << tree.get<int>("header.source") << std::endl;
        std::cout << "coffee: " << tree.get<std::string>("coffee") << std::endl;
    }
    catch (boost::property_tree::ptree_bad_path badPathException) {
        std::cout << "Exception caught for bad path: " << badPathException.what() << std::endl;
        return -1;
    }
    catch (boost::property_tree::ptree_bad_data badDataException) {
        std::cout << "Exception caught for bad data: " << badDataException.what() << std::endl;
        return -1;
    }
    catch (std::exception exception) {
        std::cout << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
        return -1;
    }
    std::cout << "Program completed!" << std::endl;
    return 0;
}

This code works just fine:

bash-4.2$ g++ -std=c++11 main.cpp -o main.exe
bash-4.2$ ./main.exe 
Reading JSON...{"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"}... Done!
version: 42
source: 1
coffee: colombian
Program completed!

So, why won't the Boost.PropertyTree get() methods work for a multi-threaded application? Could the fact that the multi-threaded application is a mix of C and C++ code be causing an issue? I see that my specific compiler version (GCC 4.8.5) hasn't been explicitly verified with this Boost library... Could this be a compiler issue? Or is the Boost 1.53 version buggy?


An update based on provided answer:

Admittedly, my original code for the parseJsonMessage method was messy (a product of dozens of debugging iterations and ripping out code that wasn't relevant to the problem). A more condensed version that is free of distractions (and possible red herrings) is the following:

#ifdef __cplusplus
extern "C"
{
#endif

#include "DirectIpRev3.hpp"

#ifdef __cplusplus
}
#endif

#include <iostream>
#include <sstream>
#include <string>

#include <boost/property_tree/ptree.hpp>
#include <boost/property_tree/json_parser.hpp>

using boost::property_tree::ptree;
using boost::property_tree::read_json;
using boost::property_tree::write_json;

extern "C" void parseJsonMessage2() {
    // Create empty property tree object.
    ptree tree;
    std::string jsonMessageString("{\"header\":{\"version\":42,\"source\":1,\"destination\":2},\"coffee\":\"colombian\"}");   //doesn't work
    std::istringstream isStreamJson(jsonMessageString);
    try {
        read_json(isStreamJson, tree);
        std::cout << tree.get<int>("header.version") << std::endl;
        std::cout << tree.get<int>("header.source") << std::endl;
        std::cout << tree.get<std::string>("coffee") << std::endl;
    }
    catch (boost::property_tree::ptree_bad_path const & badPathException) {
        std::cerr << "Exception caught for bad path: " << badPathException.what() << std::endl;
    }
    catch (boost::property_tree::ptree_bad_data const & badDataException) {
        std::cerr << "Exception caught for bad data: " << badDataException.what() << std::endl;
    }
    catch (std::exception const & exception) {
        std::cerr << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
    }
}

Running this condensed function in my multi-threaded program yields an exception:

Exception caught when parsing message into Boost.Property tree: <unspecified file>(1): expected object or array

Without the exception handling, it prints a bit more info:

terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::property_tree::json_parser::json_parser_error> >'
  what():  <unspecified file>(1): expected object or array

I'm still not too sure what might be causing the failure here, but leaning towards using nlohmann as suggested.


Solution

  • Please, don't use Property Tree to "parse" "JSON". See nlohmann or Boost.JSON

    Further

    • you're using raw new and delete apparently without good reason
    • You have unused parameters
    • you're catching polymorphic exceptions by value
    • you're leaking memory on any exception and returning null pointers when there are allocation errors

    Combining these, I'm 99% certain that your crashes are caused by something else: Undefined Behaviour has a tendency to show up elsewhere after memory corruption (e.g. stack thrashing or use-after-delete, out-of-bounds etc.).

    Using My Crystal Ball

    1. A guess: you didn't show, but the struct probably looks like

      typedef struct MyStructT {
          int version;
          int source;
          char const* coffee;
      } MyStruct;
      

      A naive mistake would be to assign coffee the same way you print it:

      myStruct->coffee = tree.get<std::string>("coffee").c_str();
      

      The "obvious"(?) problem here is that c_str() points to memory owned by the value node, and transitively by the ptree. When the function returns that pointer is stale. OOPS. UB

    2. You are allocating the struct with new (even though it is likely POD because of the extern "C", so it gives you a false sense of security, since all the members have indeterminate values anyways).

      Another naive mistake would be to pass that C code that de-allocates with ::free (like it does with everything malloc-ed, right). That's another potential source of UB.

    3. If you had "nailed" the first idea e.g. with strdup you might run into problems with more leaked memory. Even if you use delete myStruct correctly (or start using malloc instead), you will have to remember to ::free the string allocated with strdup.

    4. Your API is typical C-style (which is probably on purpose) but leaves the door wide open to passing the wrong messageLength causing out-of-bounds read. The chance of this happening is raised due to the observation that you're not even using the arguments in your own example code above.

    MULTI-THREADED STRESS TEST

    Here's a multi-threaded stress test live on Coliru. It does 1000 iterations on 25 threads.

    Live On Coliru

    #ifdef __cplusplus
    extern "C"
    {
    #endif
    
    typedef struct MyStructT {
        int version;
        int source;
        char* coffee;
    } MyStruct;
    
    //#include "ParseJson.hpp"
    
    #ifdef __cplusplus
    }
    #endif
    
    #include <iostream>
    #include <sstream>
    #include <string>
    
    #define BOOST_BIND_GLOBAL_PLACEHOLDERS
    #include <boost/property_tree/ptree.hpp>
    #include <boost/property_tree/json_parser.hpp>
    
    using boost::property_tree::ptree;
    using boost::property_tree::read_json;
    using boost::property_tree::write_json;
    
    extern "C" MyStruct* parseJsonMessage(char const* jsonMessage, unsigned int const messageLength) {
        auto myStruct = std::make_unique<MyStruct>(); // make it exception safe
        // Create empty property tree object.
        ptree tree;
    
        if (myStruct != nullptr) {
            try {
                // Create an istringstream from the JSON message.
                std::istringstream isStreamJson(std::string(jsonMessage, messageLength));
    
                // Parse the JSON into the property tree.
                //std::cout << "Reading JSON ..." << isStreamJson.str() << "...";
                read_json(isStreamJson, tree);
                //std::cout << " Done!" << std::endl;
    
                // Get the values from the property tree.
                myStruct->version = tree.get<int>("header.version");
                myStruct->source = tree.get<int>("header.source");
                myStruct->coffee = ::strdup(tree.get<std::string>("coffee").c_str());
                return myStruct.release();
            }
            catch (boost::property_tree::ptree_bad_path const& badPathException) {
                std::cerr << "Exception caught for bad path: " << badPathException.what() << std::endl;
            }
            catch (boost::property_tree::ptree_bad_data const& badDataException) {
                std::cerr << "Exception caught for bad data: " << badDataException.what() << std::endl;
            }
            catch (std::exception const& exception) {
                std::cerr << "Exception caught when parsing message into Boost.Property tree: " << exception.what() << std::endl;
            }
        }
        return nullptr;
    }
    
    #include <cstdlib>
    #include <string>
    #include <thread>
    #include <list>
    
    int main() {
        static std::string_view msg = R"({"header":{"version":42,"source":1,"destination":2},"coffee":"colombian"})";
    
        auto task = [] {
            for (auto i = 1000; --i;) {
                auto s = parseJsonMessage(msg.data(), msg.size());
    
                ::printf("version: %d\n", s->version);
                ::printf("source: %d\n", s->source);
                ::printf("coffee: %s\n", s->coffee);
    
                ::free(s->coffee);
                delete s; // not ::free!
            }
        };
    
        std::list<std::thread> pool;
    
        for (int i = 0; i < 25; ++i)
            pool.emplace_back(task);
    
        for (auto& t : pool)
            t.join();
    }
    

    The output (sorted and uniq-ed):

      24975 coffee: colombian
      24975 source: 1
      24975 version: 42