Search code examples
c++jsonrecursionmemory-managementrapidjson

Recursive insert of a chain into memory fails


This meight be a long question but i hope someone can help me figuring out whats going wrong.

I am inserting a JSON Object into already allocated Memory with my own Datatype which basically holds a Union with Data and a ptrdiff_t to the next Datatype in 8bit steps.

template <typename T>
class BaseType
{
public:
    BaseType();
    explicit BaseType(T& t);
    explicit BaseType(const T& t);

    ~BaseType();
    inline void setNext(const ptrdiff_t& next);
    inline std::ptrdiff_t getNext();
    inline void setData(T& t);
    inline void setData(const T& t);
    inline T getData() const;

protected:
    union DataUnion
    {
        T data;
        ::std::ptrdiff_t size;

        DataUnion()
        {
            memset(this, 0, sizeof(DataUnion));
        } //init with 0
        explicit DataUnion(T& t);
        explicit DataUnion(const T& t);
    } m_data;

    long long m_next;
};

The implementation is streight so nothing special happes there just setting/getting the values of the definition. (i'll skip the impl. here)

So here starts the code where something goes wrong:

std::pair<void*, void*> Page::insertObject(const rapidjson::GenericValue<rapidjson::UTF8<>>& value,
         BaseType<size_t>* last)
 {
     //return ptr to the first element
     void* l_ret = nullptr;
     //prev element ptr
     BaseType<size_t>* l_prev = last;

     //position pointer
     void* l_pos = nullptr;
     //get the members
     for (auto it = value.MemberBegin(); it != value.MemberEnd(); ++it)
     {
         switch (it->value.GetType())
         {
             case rapidjson::kNullType:
                 LOG_WARN << "null type: " << it->name.GetString();
                 continue;

             case rapidjson::kFalseType:
             case rapidjson::kTrueType:
                 {
                     l_pos = find(sizeof(BaseType<bool>));

                     void* l_new = new (l_pos) BaseType<bool>(it->value.GetBool());

                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             case rapidjson::kObjectType:
                 {
                     //pos for the obj id
                     //and insert the ID of the obj
                     l_pos = find(sizeof(BaseType<size_t>));
                     std::string name = it->name.GetString();
                     void* l_new = new (l_pos) BaseType<size_t>(common::FNVHash()(name));

                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                     //TODO something strange happens here!

                     // pass the objid Object to the insertobj!
                     // now recursive insert the obj
                     // the second contains the last element inserted
                     // l_pos current contains the last inserted element and get set to the
                     // last element of the obj we insert
                     l_pos = (insertObject(it->value, reinterpret_cast<BaseType<size_t>*>(l_new)).second);
                 }
                 break;

             case rapidjson::kArrayType:
                 {//skip this at the moment till the bug is fixed
                 }
                 break;

             case rapidjson::kStringType:
                 {
                     // find pos where the string fits
                     // somehow we get here sometimes and it does not fit!
                     // which cant be since we lock the whole page
                     l_pos = find(sizeof(StringType) + strlen(it->value.GetString()));

                     //add the String Type at the pos of the FreeType
                     auto* l_new = new (l_pos) StringType(it->value.GetString());
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;

             case rapidjson::kNumberType:
                 {
                     //doesnt matter since long long and double are equal on x64
                     //find pos where the string fits
                     l_pos = find(sizeof(BaseType<long long>));

                     void* l_new;
                     if (it->value.IsInt())
                     {
                         //insert INT
                         l_new = new (l_pos) BaseType<long long>(it->value.GetInt64());
                     }
                     else
                     {
                         //INSERT DOUBLE
                         l_new = new (l_pos) BaseType<double>(it->value.GetDouble());
                     }
                     if (l_prev != nullptr)
                         l_prev->setNext(dist(l_prev, l_new));
                 }
                 break;
             default:
                 LOG_WARN << "Unknown member Type: " << it->name.GetString() << ":" << it->value.GetType();
                 continue;
         }
         //so first element is set now, store it to return it.
         if(l_ret == nullptr)
         {
             l_ret = l_pos;
         }
         //prev is the l_pos now so cast it to this;
         l_prev = reinterpret_cast<BaseType<size_t>*>(l_pos);
     }
     //if we get here its in!
     return{ l_ret, l_pos };
 }

I am starting to insert like this:

auto firstElementPos = insertObject(value.MemberBegin()->value, nullptr).first;

While value.MemberBegin()->value is Object to be inserted and ->name holds the Name of the object. In the case below its Person and everything between {}.

The problem is, if i insert a JSON Object which has one Object inside like so:

"Person":
{
    "age":25,
    "double": 23.23,
    "boolean": true,
    "double2": 23.23,
    "firstInnerObj":{
        "innerDoub": 12.12
    }   
}

It works properly and i can reproduce the Object. But if i have more inner objects like so:

"Person":
{
    "age":25,
    "double": 23.23,
    "boolean": true,
    "double2": 23.23,
    "firstInnerObj":{
        "innerDoub": 12.12
    },
    "secondInnerObj":{
        "secInnerDoub": 12.12
    }
}

It fails and i lose data so i think that my recursion goes wrong but i dont see why. If you need any more informations let me know. Meight take a look here and the client here.

The test.json need to contain a json object like above. And the find only need to contain {"oid__":2} to get the second object that was inserted.


I could track the issue down to the Point where i recreate the Object recursively in the code. Some of the Nextpointers seem to be incorrect:

    void* Page::buildObject(const size_t& hash, void* start, rapidjson::Value& l_obj,
                            rapidjson::MemoryPoolAllocator<>& aloc)
    {
        //get the meta information of the object type
        //to build it
        auto& l_metaIdx = meta::MetaIndex::getInstance();
        //get the meta dataset
        auto& l_meta = l_metaIdx[hash];

        //now we are already in an object here with l_obj!
        auto l_ptr = start;
        for (auto it = l_meta->begin(); it != l_meta->end(); ++it)
        {
            //create the name value
            rapidjson::Value l_name(it->name.c_str(), it->name.length(), aloc);
            //create the value we are going to add
            rapidjson::Value l_value;
            //now start building it up again
            switch (it->type)
            {
                case meta::OBJECT:
                    {
                        auto l_data = static_cast<BaseType<size_t>*>(l_ptr);
                        //get the hash to optain the metadata
                        auto l_hash = l_data->getData();
                        //set to object and create the inner object
                        l_value.SetObject();

                        //get the start pointer which is the "next" element
                        //and call recursive
                        l_ptr = static_cast<BaseType<size_t>*>(buildObject(l_hash,
                                                               (reinterpret_cast<char*>(l_data) + l_data->getNext()), l_value, aloc));
                    }
                    break;
                case meta::ARRAY:
                    {
                        l_value.SetArray();
                        auto l_data = static_cast<ArrayType*>(l_ptr);
                        //get the hash to optain the metadata
                        auto l_size = l_data->size();
                        l_ptr = buildArray(l_size, static_cast<char*>(l_ptr) + l_data->getNext(), l_value, aloc);
                    }
                    break;
                case meta::INT:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<long long>*>(l_ptr);
                        //with length attribute it's faster ;)
                        l_value = l_data->getData();
                    }
                    break;
                case meta::DOUBLE:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<double>*>(l_ptr);
                        //with length attribute it's faster ;)
                        l_value = l_data->getData();
                    }
                    break;
                case meta::STRING:
                    {
                        //create the data
                        auto l_data = static_cast<StringType*>(l_ptr);
                        //with length attribute it's faster
                        l_value.SetString(l_data->getString()->c_str(), l_data->getString()->length(), aloc);
                    }
                    break;
                case meta::BOOL:
                    {
                        //create the data
                        auto l_data = static_cast<BaseType<bool>*>(l_ptr);
                        l_value = l_data->getData();
                    }
                    break;
                default:
                    break;
            }
            l_obj.AddMember(l_name, l_value, aloc);
            //update the lptr
            l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();
        }
        //return the l_ptr which current shows to the next lement. //see line above
        return l_ptr;
    }

Solution

  • After houers and houres of debugging i found the small issue which causes this. The method which builds up the Object after it was inserted returns a pointer to the actuall last element->next which was inserted and after the switch case i did call the ->next again which causes a loss of data because it scipped one element in the single chained list.

    The Fix to this is to put the line

    l_ptr = static_cast<char*>(l_ptr) + static_cast<BaseType<size_t>*>(l_ptr)->getNext();
    

    Only into the switch cases where it is not an Object or Array. Fix Commit This actually also gave me the fix for an Issue with inserting Array.

    Of cause the real issue could not know someone here who did not took a deep look into the code but i still want to show the fix here. Thanks to @sehe who helped alot with figuring out whats going wrong here.