I'm running
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
with plenty of memory:
              total        used        free      shared  buff/cache   available
Mem:           125G        3.3G        104G        879M         17G        120G
and 64-bit Anaconda: https://repo.anaconda.com/archive/Anaconda3-2022.10-Linux-x86_64.sh
I have set max_buffer_size to 64GB in both jupyter_notebook_config.json and jupyter_notebook_config.py, and, just to make sure, I also specify it on the command line:
jupyter notebook --certfile=ssl/mycert.perm --keyfile ssl/mykey.key --no-browser --NotebookApp.max_buffer_size=64000000000
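For completeness, the config entries look like this (sketched here, so treat the exact lines as illustrative):

# jupyter_notebook_config.py
c.NotebookApp.max_buffer_size = 64000000000

# jupyter_notebook_config.json
{"NotebookApp": {"max_buffer_size": 64000000000}}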
And also
cat /proc/sys/vm/overcommit_memory
1
I run a simple memory allocation snippet:
size = int(6e9)
chunk = size * ['r']
print (chunk.__sizeof__()/1e9)
as a standalone .py file and it works:
python ../readgzip.py
48.00000004
happily reporting that it allocated 48GB for my list.
However, the same code in a Jupyter notebook only works up to 7.76GB:
size = int(9.7e8)
chunk = size * ['r']
print (chunk.__sizeof__()/1e9)
7.76000004
and fails after I increase the list size from 9.7e8 to 9.75e8:
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
/tmp/ipykernel_12328/3436837519.py in <module>
1 size = int(9.75e8)
----> 2 chunk = size * ['r']
3 print (chunk.__sizeof__()/1e9)
MemoryError:
Also, on my home Windows 11 machine with 64GB of memory, I can easily run the code above and allocate 32GB of memory.
It seems I'm missing something about the Jupyter setup on Linux.
What am I missing?
Thank you
On Linux (and possibly other OSes, but I'm not sure), a MemoryError doesn't necessarily mean that the machine's memory is exhausted (in that case the OOM killer would usually be invoked). Rather, it means the process has hit a per-process resource limit (traditionally known as a ulimit, though that interface is obsolete) beyond which the kernel is unwilling to allocate it additional memory.
You can use Python's resource module to check the current process's limits (and, with sufficient permissions, set them). Here's an example:
$ prlimit --as
RESOURCE DESCRIPTION SOFT HARD UNITS
AS address space limit unlimited unlimited bytes
$ prlimit --pid=$$ --as=$((1024*1024*20)):
$ prlimit --as
RESOURCE DESCRIPTION SOFT HARD UNITS
AS address space limit 20971520 unlimited bytes
$ python
Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import resource
>>> resource.getrlimit(resource.RLIMIT_AS)
(20971520, -1)
>>> longstr = "r" * 1024*1024*10
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
MemoryError
>>> longstr = "r" * 1024*1024*3
>>> resource.setrlimit(resource.RLIMIT_AS, (1024*1024*30, resource.RLIM_INFINITY))
>>> resource.getrlimit(resource.RLIMIT_AS)
(31457280, -1)
>>> longstr = "r" * 1024*1024*10
>>> len(longstr)
10485760
>>>
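To check whether the same thing is happening inside Jupyter, run the check from a notebook cell; the kernel inherits its limits from the notebook server (and, transitively, from whatever shell or service started it). A minimal diagnostic, using only the standard library:

import os
import resource

# RLIMIT_AS caps the total virtual address space of the process.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
fmt = lambda v: "unlimited" if v == resource.RLIM_INFINITY else v
print("kernel pid:", os.getpid())
print("RLIMIT_AS soft:", fmt(soft))
print("RLIMIT_AS hard:", fmt(hard))

If the soft value comes out finite (somewhere around 8GB), that would explain why the allocation fails near 7.76GB in the notebook while the same code succeeds in a plain python process started from your own shell.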
Increasing Jupyter Notebook's limits should be done from outside the process itself, as it is generally not recommended to run Python processes with superuser privileges.
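For example, assuming the notebook server runs under your own user and only the soft limit is the problem (raising the soft limit up to the hard limit requires no special privileges), you could adjust a running process with prlimit, or lift the limit in the shell before launching Jupyter. The PID below is the one visible in your traceback path (/tmp/ipykernel_12328/...), used purely for illustration:

# raise the address-space limit of an already-running process
# (here the ipykernel from the traceback; target the server's PID
# instead to affect kernels spawned later)
$ prlimit --pid=12328 --as=unlimited

# or lift the limit in the launching shell, so the server and every
# kernel it spawns inherit it (ulimit -v sets RLIMIT_AS, in KiB)
$ ulimit -v unlimited
$ jupyter notebook --no-browser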
Read more about Linux's prlimit(1) utility and the getrlimit(2) system call.