I'm creating a template class with a dynamic array that behaves like std::vector, but the underlying array is stored in shared memory so that it can be shared between processes. At this time there is no synchronization built in; I'm just working on getting the memory-mapping resize to work as intended, and the code is a minimal example that shows where the error comes from.
The issue I'm having is that once I resize the memory past one page size, I get a SIGBUS error if I try to access the second page. If I allocate a larger initial mapping with mmap (I've tried up to 1 MB), it allocates that amount fine, but if I then resize that mapping to something larger, I again get a SIGBUS error when I access past the original bounds of the mapping.
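#include <sys/mman.h>   // mmap, munmap, memfd_create
#include <unistd.h>     // ftruncate, lseek, close
#include <cstddef>      // size_t
#include <cstring>      // memcpy
#include <iostream>

template<typename T>
class DistributedVector {
private:
    // Creates a memfd-backed shared mapping with room for `size` elements.
    T* create_shared_memory(size_t size) {
        int protection = PROT_READ | PROT_WRITE;
        int visibility = MAP_SHARED;
        m_shm_fd = memfd_create("shm", 0);
        // The file must be as large as the mapping, in bytes; touching a page
        // that lies past the end of the file raises SIGBUS.
        ftruncate(m_shm_fd, size * sizeof(T));
        void* result = mmap(nullptr, size * sizeof(T), protection, visibility, m_shm_fd, 0);
        if (result == MAP_FAILED){
            std::cerr << "Mapping failed!\n";
        }
        file_size = lseek(m_shm_fd, 0, SEEK_END);
        return (T*)result;
    }
    // Grows the storage by mapping a new region, copying the elements over,
    // and unmapping the old region.
    void resize_shared_memory(T* addr, size_t size, size_t new_size){
        int old_fd = m_shm_fd;  // create_shared_memory overwrites m_shm_fd
        auto temp = create_shared_memory(new_size);
        if (temp == MAP_FAILED) {
            std::cerr << "Mapping failed!\n";
        }
        memcpy(temp, m_begin, size*sizeof(T));
        munmap(m_begin, size*sizeof(T));  // munmap takes a length in bytes
        close(old_fd);                    // the old file is no longer needed
        m_begin = temp;
        m_end = m_begin + size;
        m_end_cap = m_begin + new_size;
    }
public:
    DistributedVector() {
        m_begin = create_shared_memory(INITIAL_VEC_CAPACITY);
        m_end_cap = m_begin + INITIAL_VEC_CAPACITY;
        m_end = m_begin;
    }
    size_t size() {
        return m_end - m_begin;
    }
    size_t capacity() {
        return m_end_cap - m_begin;
    }
    T* data(){
        return m_begin;
    }
    // Returns -1 for an out-of-range index instead of throwing.
    T at(size_t index){
        if (index < (m_end - m_begin)) {
            return *(m_begin+index);
        } else {
            return -1;
        }
    }
    // Doubles the capacity when the storage is full, then appends.
    T push_back(T ele){
        if (m_end == m_end_cap){
            resize_shared_memory(m_begin,capacity(), 2*capacity());
        }
        *m_end = ele;
        m_end++;
        return *(m_end-1);
    }
private:
    T* m_begin;
    T* m_end;
    T* m_end_cap;
    int m_shm_fd;
    size_t file_size;
};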
Running this test:
std::vector<int> v;
dtl::DistributedVector<int> d;
assert(v.size() == d.size());
for(int i = 0; i < 20000000; ++i){
    v.push_back(i);
    d.push_back(i);
    auto v_i = v.at(i);
    auto d_i = d.at(i);
    assert(v_i == d_i);
    if (i % 100 == 0)
        std::cout << v_i << ":" << d_i << '\n';
}
std::cout << '\n';
assert(v.size() == d.size());
I'm not seeing anything in the documentation about a page-size limit for mremap, and if I manually allocate a new memory section with mmap, copy the data over, and munmap the old mapping, it works fine. Is there something in mremap that I am overlooking?
Edit: I updated the code to use memfd_create and ftruncate, and to resize by calling mmap for a new memory area and munmap for the old one. The code now works as intended.
mremap cannot resize a mapping that is both shared and anonymous.
Every virtual address region has an optional file object associated with it that is used to fetch new pages and write back old modified ones. Since anonymous mappings have no such file, there is nowhere to store pages when you shrink the mapping and nowhere to get more pages when you expand it. The kernel generates SIGBUS when a thread accesses a file out of range through a mapping, which happens when the file was shrunk after it was mapped.
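To make the SIGBUS behaviour concrete, here is a minimal sketch, assuming Linux with a glibc that provides memfd_create. The helper name signal_from_touching_past_eof is made up for illustration. It maps two pages over a one-page file and lets a forked child touch the second page, so the parent can report which signal killed the child.

```cpp
#include <sys/mman.h>   // mmap, munmap, memfd_create
#include <sys/wait.h>   // waitpid, WIFSIGNALED, WTERMSIG
#include <unistd.h>     // fork, ftruncate, sysconf, close, _exit
#include <csignal>      // SIGBUS

// Hypothetical helper: returns the signal that kills a child process which
// touches a page beyond the end of the file backing its mapping, 0 if the
// child exits normally, or -1 if the setup fails.
int signal_from_touching_past_eof() {
    const long page = sysconf(_SC_PAGESIZE);
    int fd = memfd_create("demo", 0);
    if (fd < 0) return -1;
    if (ftruncate(fd, page) != 0) { close(fd); return -1; }  // file is one page long

    // Map two pages over the one-page file; mmap itself succeeds.
    void* p = mmap(nullptr, 2 * page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    volatile char* bytes = static_cast<volatile char*>(p);
    bytes[0] = 1;  // the first page is backed by the file, so this is fine

    pid_t pid = fork();
    if (pid < 0) { munmap(p, 2 * page); close(fd); return -1; }
    if (pid == 0) {
        bytes[page] = 1;  // the second page lies past the end of the file
        _exit(0);         // only reached if the write did not fault
    }

    int status = 0;
    waitpid(pid, &status, 0);
    munmap(p, 2 * page);
    close(fd);
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux system the helper should return SIGBUS: the mapping exists, but the second page has no file contents behind it.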
In theory the kernel could do something to make it work, but so far it doesn't, and doing so would still raise questions about how such a mapping should behave, as well as complexities in tracking the used portions of that mapping.
What you need is to provide a file to back your mapping. To resize it, you first have to resize the file, then remap the mapping. If you do not want to reserve a file on disk, use memfd_create to create an anonymous file in memory; this way you also get an fd which can be sent to another process through a Unix socket.
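The "resize the file, then remap" order can be sketched as follows, assuming Linux and a glibc that exposes memfd_create and mremap. The names grow_mapping and grow_demo are hypothetical, and error handling is kept to the minimum needed for the example.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE     // mremap and memfd_create; g++ usually defines this already
#endif
#include <sys/mman.h>   // mmap, mremap, munmap, memfd_create
#include <unistd.h>     // ftruncate, sysconf, close
#include <cstddef>      // std::size_t

// Hypothetical helper: grows the file first, then the mapping.
// Returns the (possibly moved) address, or MAP_FAILED on error.
void* grow_mapping(int fd, void* addr, std::size_t old_bytes, std::size_t new_bytes) {
    if (ftruncate(fd, static_cast<off_t>(new_bytes)) != 0) return MAP_FAILED;
    return mremap(addr, old_bytes, new_bytes, MREMAP_MAYMOVE);
}

// Maps one page, grows to two, and checks that both pages are usable.
bool grow_demo() {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    int fd = memfd_create("demo", 0);
    if (fd < 0) return false;
    if (ftruncate(fd, static_cast<off_t>(page)) != 0) { close(fd); return false; }

    void* p = mmap(nullptr, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return false; }
    static_cast<char*>(p)[0] = 'a';

    void* q = grow_mapping(fd, p, page, 2 * page);
    if (q == MAP_FAILED) { munmap(p, page); close(fd); return false; }

    char* bytes = static_cast<char*>(q);
    bytes[page] = 'b';  // the second page is now backed by the enlarged file
    bool ok = (bytes[0] == 'a') && (bytes[page] == 'b');

    munmap(q, 2 * page);
    close(fd);
    return ok;
}
```

Because MREMAP_MAYMOVE lets the kernel relocate the mapping, any pointers into the old region have to be recomputed from the returned address. Another process sharing the fd still has to enlarge or recreate its own mapping before it can see the new pages.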