Segmentation fault when querying Rtree retrieved from memory mapped file


I am quite puzzled. Consider the following code, slightly adapted from http://www.boost.org/doc/libs/1_57_0/libs/geometry/doc/html/geometry/spatial_indexes/rtree_examples/index_stored_in_mapped_file_using_boost_interprocess.html :

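#include <cstddef>
#include <iostream>
#include <iterator>
#include <string>
#include <utility>
#include <vector>

#include <boost/filesystem.hpp>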

#include <boost/geometry.hpp>
#include <boost/geometry/geometries/point.hpp>
#include <boost/geometry/geometries/box.hpp>
#include <boost/geometry/index/rtree.hpp>

#include <boost/interprocess/managed_mapped_file.hpp>

namespace bg = boost::geometry;
namespace bgi = boost::geometry::index;
namespace bi = boost::interprocess;

typedef bg::model::point<float, 2, bg::cs::cartesian> point; 
typedef std::pair<point, int> value_t; // **
typedef bgi::linear<32, 8> params_t;
typedef bgi::indexable<value_t> indexable_t;
typedef bgi::equal_to<value_t> equal_to_t;
typedef bi::allocator<value_t, bi::managed_mapped_file::segment_manager> allocator_t;
typedef bgi::rtree<value_t, params_t, indexable_t, equal_to_t, allocator_t> rtree_t;

using namespace boost::filesystem;

int main(int argc, char * argv[])
{   

    std::string indexFile = "/home/jerome/proteome/index_tree.dat";
    remove(indexFile); 

    int mmfSize = 1200000;

    {
        bi::managed_mapped_file file(bi::open_or_create,indexFile.c_str(), mmfSize);
        allocator_t alloc(file.get_segment_manager());
        rtree_t * rtree_ptr = file.find_or_construct<rtree_t>("rtree")(params_t(), indexable_t(), equal_to_t(), alloc);

        std::cout << "Indexing ... " << std::endl;
        for(int i = 0; i < 1001; i++)
        {
            rtree_ptr->insert(std::make_pair(point(i,i),i*i));  
        }

        std::cout << "Indexing done." << std::endl;
    }

    {
        bi::managed_mapped_file file(bi::open_or_create,indexFile.c_str(), mmfSize);
        allocator_t alloc(file.get_segment_manager());
        rtree_t * rtree_ptr = file.find_or_construct<rtree_t>("rtree")(params_t(), indexable_t(), equal_to_t(), alloc);

        std::cout << "Tree loaded, contains " << rtree_ptr->size() << " elements" << std::endl;

        // query point
        point pt(2, 1);

        std::vector<value_t> results;
        rtree_ptr->query(bgi::nearest(pt, 3), std::back_inserter(results));
        std::cout << "Query performed" << std::endl;    

        for (std::size_t i = 0; i < results.size(); i++)
        {
            value_t v = results[i];
            std::cout << "Found the point " << v.second << " at a distance of " << bg::distance(v.first,pt) << std::endl; 
        }
    }

}

It works great. It creates an R-tree, stores it in a memory-mapped file, then retrieves it and queries it with no problem. However, as soon as I split this program into two source files (one that builds the tree and one that queries it), the queries stop working! (The "..." in the code below stands for all the includes and typedefs from the initial example; they are copied exactly into both files and omitted here for clarity.)

Building file:

...
int main(int argc, char * argv[])
{   

    std::string indexFile = "/home/jerome/proteome/index_tree.dat";
    remove(indexFile); 

    int mmfSize = 1200000;

    {
        bi::managed_mapped_file file(bi::open_or_create,indexFile.c_str(), mmfSize);
        allocator_t alloc(file.get_segment_manager());
        rtree_t * rtree_ptr = file.find_or_construct<rtree_t>("rtree")(params_t(), indexable_t(), equal_to_t(), alloc);

        std::cout << "Indexing ... " << std::endl;
        for(int i = 0; i < 1001; i++)
        {
            rtree_ptr->insert(std::make_pair(point(i,i),i*i));  
        }

        std::cout << "Indexing done." << std::endl;
    }

}

Query file:

...

int main(int argc, char * argv[])
{   

    std::string indexFile = "/home/jerome/proteome/index_tree.dat";

    int mmfSize = 1200000;

    {
        bi::managed_mapped_file file(bi::open_or_create,indexFile.c_str(), mmfSize);
        allocator_t alloc(file.get_segment_manager());
        rtree_t * rtree_ptr = file.find_or_construct<rtree_t>("rtree")(params_t(), indexable_t(), equal_to_t(), alloc);

        std::cout << "Tree loaded, contains " << rtree_ptr->size() << " elements" << std::endl;

        // query point
        point pt(2, 1);

        std::vector<value_t> results;
        rtree_ptr->query(bgi::nearest(pt, 3), std::back_inserter(results));
        std::cout << "Query performed" << std::endl;    

        for (std::size_t i = 0; i < results.size(); i++)
        {
            value_t v = results[i];
            std::cout << "Found the point " << v.second << " at a distance of " << bg::distance(v.first,pt) << std::endl; 
        }
    }

}

(The remove() call deletes any existing index file, so that each build starts fresh.)

The building code works fine, but the query code fails:

Tree loaded, contains 1001 elements
Segmentation fault (core dumped)

Any ideas? I suspect that something is lost when the tree is retrieved, so the retrieved tree is malformed and causes memory errors when queried. But why would this happen when the code is split across two different files, and not when it is in the same file but in two different scopes? Shouldn't the behaviour be exactly the same?

Edit: I was using Boost 1.54.


Solution

  • Internally the R-tree can use various types of nodes, though the interface for defining and choosing them is not documented and probably never will be. In Boost 1.56 the default node type was changed to a variant-based one precisely because of the issue you're facing.

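    So, to use the rtree with Interprocess without problems, one option that follows directly from this is to upgrade to Boost 1.56 or newer, where the variant-based nodes are the default.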

    See also this discussion: http://boost-geometry.203548.n3.nabble.com/rtree-crash-when-used-with-inter-process-td4026037.html

    One more solution is mentioned at the end of the above discussion, but it is more complicated and depends on the internals of the library. It might stop compiling at some point (in fact, it should work only for Boost 1.56 and below). However, if you used it, your program would need only the official Boost to compile, without any modifications.
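
To illustrate why variant-based nodes are safer to persist, here is a minimal sketch using only the standard library. It rests on an assumption that is not stated above, so treat it as a likely explanation rather than a confirmed one: the pre-1.56 default nodes used virtual functions, so each stored node carried a vtable pointer that is only meaningful inside the binary that wrote it. That would explain why the single-program version works while a separately compiled query program crashes. The names `Node`, `NodeKind`, `count_values` and `roundtrip_through_bytes` are hypothetical and are not part of Boost.Geometry.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical tag-dispatched node: the kind of node is stored as a plain
// integer, not as a vtable pointer, so the bytes mean the same thing in any
// program that shares this definition.
enum NodeKind { Leaf = 0, Internal = 1 };

struct Node {
    int kind;        // a NodeKind value, stored as data
    int count;       // number of used slots in payload
    int payload[4];  // leaf: values; internal: per-child value counts (illustrative)
};

// Dispatch on the stored tag instead of calling a virtual function.
int count_values(const Node& n) {
    assert(n.count >= 0 && n.count <= 4);
    int total = 0;
    switch (n.kind) {
    case Leaf:
        total = n.count;
        break;
    case Internal:
        for (int i = 0; i < n.count; ++i) total += n.payload[i];
        break;
    }
    return total;
}

// Simulate "write to a file, read it back elsewhere" by copying raw bytes.
// This is only valid for trivially copyable types, which excludes any class
// with virtual functions.
Node roundtrip_through_bytes(const Node& n) {
    static_assert(std::is_trivially_copyable<Node>::value,
                  "raw byte copies require a trivially copyable type");
    std::vector<unsigned char> bytes(sizeof(Node));
    std::memcpy(bytes.data(), &n, sizeof(Node));
    Node restored;
    std::memcpy(&restored, bytes.data(), sizeof(Node));
    return restored;
}
```

Calling `count_values` on a node returned by `roundtrip_through_bytes` gives the same result as on the original, because nothing in the copied bytes refers to an address inside the writing program. The sketch covers only the dispatch mechanism; the real rtree additionally relies on the Interprocess allocator to keep its internal pointers valid inside the mapped file.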