Following the solution from a related question, I created a Docker container which loads the GoogleNews-vectors-negative300 KeyedVectors and reads it all into memory:
word_vectors = KeyedVectors.load(model_path, mmap='r')
word_vectors.most_similar('stuff')
I also have another Docker container which provides a REST API and loads the same model with
KeyedVectors.load(model_path, mmap='r')
I observe that the fully loaded container takes more than 5 GB of memory, and that each gunicorn worker takes a further 1.7 GB:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
acbfd080ab50 vectorizer_model_loader_1 0.00% 5.141GiB / 15.55GiB 33.07% 24.9kB / 0B 32.9MB / 0B 15
1a9ad3dfdb8d vectorizer_vectorizer_1 0.94% 1.771GiB / 15.55GiB 11.39% 26.6kB / 0B 277MB / 0B 17
However, I expected all these processes to share the same memory for the KeyedVectors, so that together they would only take the 5.4 GB shared between all containers.
Has anyone tried to achieve this and succeeded?
edit: I tried the following code snippet and it does share the same memory across different containers:
import mmap
from threading import Semaphore

with open("data/GoogleNews-vectors-negative300.bin", "rb") as f:
    # memory-map the file; size 0 means the whole file
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # read the whole content to fault every page into the page cache
    mm.read()
    # block forever so the container stays alive for inspection
    Semaphore(0).acquire()
    # close the map (never reached while blocked above)
    mm.close()
So the problem is that KeyedVectors.load(model_path, mmap='r') does not share memory.
edit2:
Studying gensim's source code, I see that np.load(subname(fname, attrib), mmap_mode=mmap) is called to open the memmapped file. The following code sample shares memory across multiple containers:
from threading import Semaphore

import numpy as np

data = np.load('data/native_format.bin.vectors.npy', mmap_mode='r')
print(data.shape)
# touch every value to fault the whole file into memory
print(data.mean())
# block forever so the container stays alive
Semaphore(0).acquire()
After extensive debugging I figured out that mmap works as expected for the numpy arrays inside the KeyedVectors object.
However, KeyedVectors has other attributes like self.vocab, self.index2word and self.index2entity which are not shared and consume ~1.7 GB of memory in each process.
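A rough way to see where the per-worker ~1.7 GB goes is to sum the recursive size of those plain-Python attributes; deep_sizeof below is a hypothetical helper (not part of gensim) that follows dicts and lists, which is where vocab and index2word live:

```python
import sys

def deep_sizeof(obj, seen=None):
    """Recursively estimate the in-memory size of an object graph in bytes.

    Follows dicts, lists, tuples and sets; each object is counted once.
    """
    if seen is None:
        seen = set()
    oid = id(obj)
    if oid in seen:
        return 0
    seen.add(oid)
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size

# e.g. deep_sizeof(word_vectors.vocab) + deep_sizeof(word_vectors.index2word)
# accounts for most of the per-process footprint, while the memmapped
# vectors array adds almost nothing to each process's private memory.
```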