c++ linux system-calls mmap

mremap will not expand past size of one page


I'm creating a template class with a dynamic array that behaves like std::vector, but the underlying array is stored in shared memory so that it can be shared between processes. There is no synchronization built in yet; for now I'm just working on getting the resizing of the memory mapping to work as intended. The code below is a minimal example that shows where the error comes from.

The issue I'm having is that once I resize the mapping past one page, I get a SIGBUS error if I try to access the second page. If I allocate a larger mapping with mmap up front (I've tried up to 1 MB), that amount is allocated fine, but if I then try to grow that mapping I again get a SIGBUS error when I access anything past the original bounds of the mapping.

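#include <sys/mman.h>   // memfd_create, mmap, munmap
#include <unistd.h>     // ftruncate, lseek, close
#include <cstddef>      // size_t
#include <cstring>      // memcpy
#include <iostream>     // std::cerr
#include <stdexcept>    // std::runtime_error (for reporting mapping failures)

template<typename T>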
class DistributedVector {
private:
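    // Creates an in-memory file sized for `size` elements of T and maps it shared.
    T* create_shared_memory(size_t size) {
        int protection = PROT_READ | PROT_WRITE;
        int visibility = MAP_SHARED;
        size_t bytes = size * sizeof(T);

        m_shm_fd = memfd_create("shm", 0);
        if (m_shm_fd == -1) {
            std::cerr << "memfd_create failed!\n";
            return (T*)MAP_FAILED;
        }
        // The file must cover the whole mapping, so size it in bytes, not elements:
        // touching a mapped page beyond the end of the file raises SIGBUS.
        if (ftruncate(m_shm_fd, bytes) == -1) {
            std::cerr << "ftruncate failed!\n";
            close(m_shm_fd);
            return (T*)MAP_FAILED;
        }
        void* result = mmap(nullptr, bytes, protection, visibility, m_shm_fd, 0);
        if (result == MAP_FAILED){
            std::cerr << "Mapping failed!\n";
        }
        file_size = lseek(m_shm_fd, 0, SEEK_END);
        return (T*)result;
    }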

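    // Grows the storage by mapping a new, larger in-memory file and copying the elements over.
    void resize_shared_memory(T* addr, size_t size, size_t new_size){
        int old_fd = m_shm_fd;  // create_shared_memory overwrites m_shm_fd
        auto temp = create_shared_memory(new_size);
        if (temp == MAP_FAILED) {
            m_shm_fd = old_fd;  // the old mapping and its file are still valid
            throw std::runtime_error("resize_shared_memory: mapping failed");
        }
        memcpy(temp, addr, size*sizeof(T));
        munmap(addr, size*sizeof(T));  // munmap takes a length in bytes, not elements
        close(old_fd);                 // release the old backing file
        m_begin = temp;
        m_end = m_begin + size;
        m_end_cap = m_begin + new_size;
    }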

public:
    DistributedVector() {
        m_begin = create_shared_memory(INITIAL_VEC_CAPACITY);
        m_end_cap = m_begin + INITIAL_VEC_CAPACITY;
        m_end = m_begin;
    }

    size_t size() {
        return m_end - m_begin;
    }

    size_t capacity() {
        return m_end_cap - m_begin;
    }

    T* data(){
        return m_begin;
    }

    T at(size_t index){
        if (index < (m_end - m_begin)) {
            return *(m_begin+index);
        } else {
            return -1;
        }
    }

    T push_back(T ele){
        if (m_end == m_end_cap){
            resize_shared_memory(m_begin,capacity(), 2*capacity());
        }
        *m_end = ele;
        m_end++;
        return *(m_end-1);
    }

private:
    T* m_begin;
    T* m_end;
    T* m_end_cap;
    int m_shm_fd;
    size_t file_size;

};

Running this test:

std::vector<int> v;
dtl::DistributedVector<int> d;

assert(v.size() == d.size());

for(int i = 0; i < 20000000; ++i){
    v.push_back(i); 
    d.push_back(i);
    auto v_i = v.at(i);
    auto d_i = d.at(i);
    assert(v_i == d_i);
    if (i % 100 == 0)
        std::cout << v_i << ":" << d_i << '\n';
}
std::cout << '\n';

assert(v.size() == d.size());

I'm not seeing anything in the documentation about a limit on how far mremap can grow a mapping, and if I manually allocate a new memory section with mmap, copy the data over, and munmap the old mapping, it works fine. Is there something in mremap that I am overlooking?

Edit: I updated the code to use memfd_create and ftruncate, and to resize by mmap-ing a new memory area and munmap-ing the old one. The code now works as intended.


Solution

  • mremap cannot resize a mapping that is both shared and anonymous.

    Every virtual address region has an optional file object associated with it, which is used to fetch new pages and to write back modified ones. Since anonymous mappings have no such file, there is nowhere to store pages when you shrink the mapping and nowhere to get more pages when you expand it. The kernel raises SIGBUS when a thread accesses, through a mapping, a location beyond the end of the backing file, for example when the file was shrunk after it was mapped.

    In theory the kernel could do something to make this work, but so far it doesn't. Doing so would still raise questions about how such a mapping should behave, and would add complexity in tracking which portions of the mapping are in use.

    What you need is a file to back your mapping. To resize it, first resize the file, then remap the mapping. If you do not want to reserve a file on disk, use memfd_create to create an anonymous file in memory; this way you also get an fd that can be sent to another process through a Unix socket.
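
The "resize the file, then remap" order described above can be sketched roughly as follows. This is a minimal illustration rather than a drop-in replacement for the class in the question: the helper names grow_mapping and demo are hypothetical, and it assumes Linux with a glibc that exposes memfd_create and mremap (g++ defines _GNU_SOURCE by default, which both need).

```cpp
#include <sys/mman.h>   // memfd_create, mmap, mremap, munmap
#include <unistd.h>     // ftruncate, close, sysconf
#include <cstddef>      // std::size_t

// Hypothetical helper: grow a file-backed shared mapping from old_bytes to new_bytes.
// Returns the (possibly relocated) address, or MAP_FAILED on error.
void* grow_mapping(int fd, void* addr, std::size_t old_bytes, std::size_t new_bytes) {
    // 1. Grow the backing file first, so the new pages have somewhere to come from.
    if (ftruncate(fd, static_cast<off_t>(new_bytes)) == -1) {
        return MAP_FAILED;
    }
    // 2. Then extend the mapping. MREMAP_MAYMOVE allows the kernel to relocate it
    //    if it cannot grow in place, so callers must use the returned pointer.
    return mremap(addr, old_bytes, new_bytes, MREMAP_MAYMOVE);
}

// Hypothetical demo: map one page, grow to four, and touch the last page.
// Returns 0 if both the old and the new data are readable, nonzero otherwise.
int demo() {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));

    int fd = memfd_create("shm_demo", 0);   // anonymous file living in memory
    if (fd == -1) return -1;
    if (ftruncate(fd, static_cast<off_t>(page)) == -1) { close(fd); return -1; }

    void* p = mmap(nullptr, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    static_cast<char*>(p)[0] = 'a';         // write into the first page

    void* q = grow_mapping(fd, p, page, 4 * page);
    if (q == MAP_FAILED) { munmap(p, page); close(fd); return -1; }

    char* bytes = static_cast<char*>(q);
    bytes[3 * page] = 'z';                  // fourth page: backed by the grown file
    int result = (bytes[0] == 'a' && bytes[3 * page] == 'z') ? 0 : 1;

    munmap(q, 4 * page);
    close(fd);
    return result;
}
```

Calling demo() from main should return 0 on a system where these calls are available. Because the mapping shares the file's pages, the data written before the resize stays visible after it without any copying. Any other process holding the same fd would need to map (or remap) the larger range itself to see the new pages.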