Tags: c++, memory-mapped-files, reinterpret-cast, stdtuple

Can I placement new a std::tuple into a memory mapped region, and read it back later?


I have some packed structs which I will be writing to a memory mapped file. They are all POD.

To accommodate some generic programming I'm doing, I want to be able to write a std::tuple of several packed structs.

I'm worried that writing the members of a std::tuple to my mapped region's address, and then later casting that address back to a std::tuple is going to break.

I've written a small exemplar program, and it does seem to work, but I'm worried that I have undefined behaviour.

Here are my structs:

struct Foo
{
    char    c;    
    uint8_t pad[3];
    int     i;                   
    double  d;                   

} __attribute__((packed));

struct Bar
{
    int     i;                   
    char    c;                   
    uint8_t pad[3];
    double  d;                   

} __attribute__((packed));

I define a std::tuple of these structs:

using Tup = std::tuple<Foo, Bar>;
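Whether the bytes of a Tup can be copied around and reinterpreted later depends on properties the standard leaves to the library implementation. A minimal sketch of how one might inspect them, assuming GCC or Clang for the packed attribute (LayoutReport and describe_tup are hypothetical names, not part of the question):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <type_traits>

struct Foo
{
    char         c;
    std::uint8_t pad[3];
    int          i;
    double       d;
} __attribute__((packed));

struct Bar
{
    int          i;
    char         c;
    std::uint8_t pad[3];
    double       d;
} __attribute__((packed));

using Tup = std::tuple<Foo, Bar>;

// Hypothetical helper type: the properties that matter when raw bytes of a
// Tup are written to a mapping and reinterpreted later.
struct LayoutReport
{
    std::size_t size;
    std::size_t alignment;
    bool        trivially_copyable;
    bool        standard_layout;
};

// Hypothetical helper: gathers the properties for this particular toolchain.
inline LayoutReport describe_tup()
{
    return LayoutReport{
        sizeof(Tup),
        alignof(Tup),
        std::is_trivially_copyable<Tup>::value,
        std::is_standard_layout<Tup>::value
    };
}
```

The values are implementation-specific, so the report is only meaningful for the compiler and standard library it was built with. In particular, std::tuple is not guaranteed to be trivially copyable or standard-layout, even when all of its element types are.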

To simulate the memory-mapped file, I have created a small object with some inline storage and a size. When a tuple is added, it uses placement new to construct the tuple in the inline storage:

struct Storage
{
    Tup& push_back(Tup&& t)
    {
        Tup* p = reinterpret_cast<Tup*>(buf) + size;
        new (p) Tup(std::move(t));

        size += 1;

        return *p;
    }

    const Tup& get(std::size_t i) const
    {
        const Tup* p = reinterpret_cast<const Tup*>(buf) + i;
        return *p;
    }

    std::size_t  size = 0;
    std::uint8_t buf[100];
};

To simulate writing to a file and then reading it again I create one Storage object, populate it, copy it, and then let the original go out of scope.

Storage s2;

// scope of s1
{
    Storage s1;

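    // The empty braces zero-initialise the pad member, so the values that
    // follow land in the intended fields.
    Tup t1 = { Foo { 'a', {}, 1, 2.3 }, Bar { 2, 'b', {}, 3.4 } };
    Tup t2 = { Foo { 'c', {}, 3, 5.6 }, Bar { 4, 'd', {}, 7.8 } };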

    Tup& s1t1 = s1.push_back(std::move(t1));
    Tup& s1t2 = s1.push_back(std::move(t2));

    std::get<0>(s1t1).c = 'x';
    std::get<1>(s1t2).c = 'z';

    s2 = s1;
}

I then read my tuples back using Storage::get, which just does a reinterpret_cast of the inline storage to const Tup*.

const Tup& s2t1 = s2.get(0);

When I access the structs within the tuple they have the correct values.

In addition, running the program under valgrind doesn't report any errors.

  • Is what I'm doing defined behaviour?
  • Is it safe to reinterpret_cast from my inline storage to std::tuple if the tuple was originally placement newed there (into a file which will be closed and then later remapped and reread)?

Memory mapped file:

The actual storage I use is a struct cast onto a boost::mapped_region.

The struct is:

struct Storage
{
    std::size_t  size;
    std::uint8_t buf[1]; // address of buf is beginning of Tup array
};

I cast it as follows:

boost::mapped_region region_ = ...;
Storage* storage = reinterpret_cast<Storage*>(region_.get_address());

Will the alignment issues mentioned in answers below be a problem?
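For the mapped layout, what matters is where buf starts relative to the beginning of the mapping. A rough sketch of that check, using the header struct from above (buf_offset_is_aligned is a hypothetical helper, and it assumes the mapping itself starts on a boundary at least as large as the alignment being tested):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Same shape as the mapped header in the question.
struct Storage
{
    std::size_t  size;
    std::uint8_t buf[1]; // address of buf is beginning of the element array
};

// Hypothetical helper: true when an element array starting at buf would be
// aligned to `alignment` bytes, assuming the Storage object itself sits on a
// boundary of at least `alignment` bytes.
constexpr bool buf_offset_is_aligned(std::size_t alignment)
{
    return offsetof(Storage, buf) % alignment == 0;
}
```

buf sits directly after size, so with an 8-byte std::size_t the element array begins 8 bytes into the mapping. That is fine for element types whose alignment is 8 or less, which should include the packed structs here (GCC and Clang normally give packed structs an alignment of 1), but not for a type that needs 16-byte alignment.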

Full example below:

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <tuple>
#include <utility>

struct Foo
{
    char    c;    
    uint8_t pad[3];
    int     i;                   
    double  d;                   

} __attribute__((packed));

struct Bar
{
    int     i;                   
    char    c;                   
    uint8_t pad[3];
    double  d;                   

} __attribute__((packed));

using Tup = std::tuple<Foo, Bar>;

struct Storage
{
    Tup& push_back(Tup&& t)
    {
        Tup* p = reinterpret_cast<Tup*>(buf) + size;
        new (p) Tup(std::move(t));

        size += 1;

        return *p;
    }

    const Tup& get(std::size_t i) const
    {
        const Tup* p = reinterpret_cast<const Tup*>(buf) + i;
        return *p;
    }

    std::size_t  size = 0;
    std::uint8_t buf[100];
};

int main ()
{
    Storage s2;

    // scope of s1
    {
        Storage s1;

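        // The empty braces zero-initialise the pad member, so the values that
        // follow land in the intended fields.
        Tup t1 = { Foo { 'a', {}, 1, 2.3 }, Bar { 2, 'b', {}, 3.4 } };
        Tup t2 = { Foo { 'c', {}, 3, 5.6 }, Bar { 4, 'd', {}, 7.8 } };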

        Tup& s1t1 = s1.push_back(std::move(t1));
        Tup& s1t2 = s1.push_back(std::move(t2));

        std::get<0>(s1t1).c = 'x';
        std::get<1>(s1t2).c = 'z';

        s2 = s1;
    }

    const Tup& s2t1 = s2.get(0);
    const Tup& s2t2 = s2.get(1);

    const Foo& f1 = std::get<0>(s2t1);
    const Bar& b1 = std::get<1>(s2t1);

    const Foo& f2 = std::get<0>(s2t2);
    const Bar& b2 = std::get<1>(s2t2);

    assert(f1.c == 'x');
    assert(f1.i == 1);
    assert(f1.d == 2.3);

    assert(b1.i == 2);
    assert(b1.c == 'b');
    assert(b1.d == 3.4);

    assert(f2.c == 'c');
    assert(f2.i == 3);
    assert(f2.d == 5.6);

    assert(b2.i == 4);
    assert(b2.c == 'z');
    assert(b2.d == 7.8);

    return 0;
}

Solution

  • You may want to align the std::uint8_t buf[100] storage, because accessing an object through a misaligned pointer is undefined behaviour:

    std::aligned_storage<sizeof(Tup) * 100, alignof(Tup)>::type buf;

    (Your original buffer held 100 bytes; this one holds 100 Tups.)

    When you map pages, they start on at least a 4 KiB boundary on x86. If your storage starts at the beginning of a page, it is suitably aligned for any power-of-two alignment up to 4 KiB.
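    To illustrate the page-alignment point, here is a small sketch of a runtime alignment check. is_aligned is a hypothetical helper, and fake_page is only a stand-in for a real mapping (it assumes the toolchain honours a 4096-byte alignas on a static buffer, as GCC and Clang do on common platforms):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when p is a multiple of `alignment`.
// `alignment` is assumed to be a power of two.
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Stand-in for a mapped page: 4096 bytes, aligned to 4096 bytes.
alignas(4096) static unsigned char fake_page[4096];
```

    An address at the start of the page passes the check for every power of two up to 4096, while an address 8 bytes in (where a buffer following a std::size_t header would typically start) passes for 8 but not for 16.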


    "I'm worried that writing the members of a std::tuple to my mapped region's address, and then later casting that address back to a std::tuple is going to break."

    As long as the applications communicating through mapped memory use the same ABI, that works as expected.
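    The same-ABI condition is easier to reason about for a plain aggregate than for std::tuple, whose layout is up to the standard library. This is not part of the answer above, just a conservative sketch that assumes a hypothetical Record struct in place of Tup and copies bytes with std::memcpy instead of casting the mapped address (write_record and read_record are hypothetical names too):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

struct Foo
{
    char         c;
    std::uint8_t pad[3];
    int          i;
    double       d;
} __attribute__((packed));

struct Bar
{
    int          i;
    char         c;
    std::uint8_t pad[3];
    double       d;
} __attribute__((packed));

// Hypothetical replacement for std::tuple<Foo, Bar>: member order and offsets
// are fixed by the struct definition rather than by the library.
struct Record
{
    Foo foo;
    Bar bar;
};

static_assert(std::is_trivially_copyable<Record>::value,
              "Record must be safe to copy byte by byte");
static_assert(std::is_standard_layout<Record>::value,
              "Record must have a predictable member layout");

// Copies the object representation of r into the destination bytes.
inline void write_record(unsigned char* dst, const Record& r)
{
    std::memcpy(dst, &r, sizeof(Record));
}

// Rebuilds a Record from bytes previously produced by write_record.
inline Record read_record(const unsigned char* src)
{
    Record r;
    std::memcpy(&r, src, sizeof(Record));
    return r;
}
```

    Copying through std::memcpy also sidesteps the alignment question, since the bytes never need to be accessed through a Record pointer at the mapped address. The reader and writer still have to agree on the sizes and byte order of int and double.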