I am encountering a segfault when I make a reticulated call to matplotlib.pyplot.plot().
Steps to reproduce the error:
Create a Dockerfile with the contents:
FROM rocker/r-ver:latest
RUN apt update && apt install -y python3.8-venv python3.8-dev
RUN install2.r --error reticulate
COPY test.R /root/
Create a file test.R (in the same location) with the contents:
reticulate::virtualenv_create(
  envname = "./venv",
  packages = c("matplotlib")
)
reticulate::use_virtualenv("./venv")
reticulate::py_run_string("import matplotlib.pyplot as plt; plt.plot([1, 2, 3], [1, 2, 3])")
Build an image from the Dockerfile: docker build . --tag="segfault-reprex"
Try to run test.R in the running container: docker run segfault-reprex Rscript /root/test.R
This gives the full traceback listed below.
Full traceback
Using Python: /usr/bin/python3.8
Creating virtual environment './venv' ... Done!
Installing packages: 'pip', 'wheel', 'setuptools', 'matplotlib'
Collecting pip
Downloading pip-21.3.1-py3-none-any.whl (1.7 MB)
Collecting wheel
Downloading wheel-0.37.1-py2.py3-none-any.whl (35 kB)
Collecting setuptools
Downloading setuptools-60.5.0-py3-none-any.whl (958 kB)
Collecting matplotlib
Downloading matplotlib-3.5.1-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (11.3 MB)
Collecting kiwisolver>=1.0.1
Downloading kiwisolver-1.3.2-cp38-cp38-manylinux_2_5_x86_64.manylinux1_x86_64.whl (1.2 MB)
Collecting fonttools>=4.22.0
Downloading fonttools-4.28.5-py3-none-any.whl (890 kB)
Collecting packaging>=20.0
Downloading packaging-21.3-py3-none-any.whl (40 kB)
Collecting cycler>=0.10
Downloading cycler-0.11.0-py3-none-any.whl (6.4 kB)
Collecting numpy>=1.17
Downloading numpy-1.22.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)
Collecting pillow>=6.2.0
Downloading Pillow-9.0.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.3 MB)
Collecting python-dateutil>=2.7
Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Collecting pyparsing>=2.2.1
Downloading pyparsing-3.0.6-py3-none-any.whl (97 kB)
Collecting six>=1.5
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: pip, wheel, setuptools, kiwisolver, fonttools, pyparsing, packaging, cycler, numpy, pillow, six, python-dateutil, matplotlib
Attempting uninstall: pip
Found existing installation: pip 20.0.2
Uninstalling pip-20.0.2:
Successfully uninstalled pip-20.0.2
Attempting uninstall: setuptools
Found existing installation: setuptools 44.0.0
Uninstalling setuptools-44.0.0:
Successfully uninstalled setuptools-44.0.0
Successfully installed cycler-0.11.0 fonttools-4.28.5 kiwisolver-1.3.2 matplotlib-3.5.1 numpy-1.22.0 packaging-21.3 pillow-9.0.0 pip-21.3.1 pyparsing-3.0.6 python-dateutil-2.8.2 setuptools-60.5.0 six-1.16.0 wheel-0.37.1
Virtual environment './venv' successfully created.
*** caught segfault ***
address 0x7ffaeabe1100, cause 'memory not mapped'
Traceback:
1: py_run_string_impl(code, local, convert)
2: reticulate::py_run_string("import matplotlib.pyplot as plt; plt.plot([1, 2, 3], [1, 2, 3])")
An irrecoverable exception occurred. R is aborting now ...
Things I have noted:
A minimal example involving e.g. the pandas package, rather than matplotlib, runs successfully, i.e. if test.R contains:
reticulate::virtualenv_create(
  envname = "./venv",
  packages = c("pandas")
)
reticulate::use_virtualenv("./venv")
reticulate::py_run_string("import pandas as pd; df = pd.DataFrame()")
If you enter the container interactively (docker run -it segfault-reprex /bin/bash), run test.R (Rscript /root/test.R), and activate the resulting virtualenv (source /root/venv/bin/activate), you can use matplotlib fine from Python (python -c "import matplotlib.pyplot as plt; plt.plot([1, 2, 3], [1, 2, 3])").
The reticulate documentation states that:
for reticulate to bind to a version of Python it must be compiled with shared library support (i.e. with the --enable-shared flag)
Running the following shows that the container's Python was compiled with shared library support:
docker run -it segfault-reprex /usr/bin/python3 -c "import sysconfig; print(sysconfig.get_config_vars('Py_ENABLE_SHARED'))"
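The same shared-library check can be run from inside any Python interpreter, for example the one in the virtualenv. The following is a minimal sketch and not part of the original post: the helper name built_with_shared_library is hypothetical, and it relies only on the standard-library sysconfig module.

```python
import sys
import sysconfig


def built_with_shared_library() -> bool:
    """Report whether this interpreter was configured with --enable-shared.

    Py_ENABLE_SHARED is typically 1 or 0 on Unix builds; it may be None on
    platforms that do not define it, which is treated as False here.
    """
    return bool(sysconfig.get_config_var("Py_ENABLE_SHARED"))


if __name__ == "__main__":
    # Print the interpreter path too, so it is clear which Python was checked.
    print(sys.executable, built_with_shared_library())
```

Running it with ./venv/bin/python rather than /usr/bin/python3 would confirm the result for the interpreter that reticulate actually binds to.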
The problem is that the R binary in rocker/r-ver:latest is compiled against a different BLAS library to the one which the numpy on PyPI is compiled against.
This was explained to me by Tomasz Kalinowski here.
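One way to see which BLAS libraries end up loaded in a process is to inspect its memory map. The sketch below is an illustration under stated assumptions rather than the diagnosis used in the original post: it assumes a Linux system with /proc/self/maps, the function names are hypothetical, and matching library file names against "blas", "lapack", "mkl" or "atlas" is only a heuristic.

```python
import re
from pathlib import Path
from typing import List

# Heuristic: file names of common BLAS/LAPACK implementations.
BLAS_PATTERN = re.compile(r"(blas|lapack|mkl|atlas)", re.IGNORECASE)


def blas_libraries_in_maps(maps_text: str) -> List[str]:
    """Return sorted, unique paths of mapped shared objects that look like BLAS/LAPACK."""
    paths = set()
    for line in maps_text.splitlines():
        # Format: address perms offset dev inode pathname (pathname is optional).
        fields = line.split(maxsplit=5)
        if len(fields) < 6:
            continue
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1]
        if ".so" in name and BLAS_PATTERN.search(name):
            paths.add(path)
    return sorted(paths)


def loaded_blas_libraries() -> List[str]:
    """Inspect the current process; returns [] where /proc/self/maps is unavailable."""
    maps = Path("/proc/self/maps")
    if not maps.exists():
        return []
    return blas_libraries_in_maps(maps.read_text())


if __name__ == "__main__":
    for lib in loaded_blas_libraries():
        print(lib)
```

Calling loaded_blas_libraries() after importing numpy, once in plain Python and once through reticulate::py_run_string(), should list a different set of libraries if R has already loaded its own BLAS, assuming the import itself does not crash.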
The solution is to ensure numpy uses the same BLAS libraries as rocker/r-ver's R binary does. An easy way to ensure this is to compile numpy from source. This compilation could be performed at either image build time or container runtime.
To compile numpy at container runtime we can leave our Dockerfile as is, and add a call to system2() after our initial call to reticulate::virtualenv_create(), altering test.R to become:
reticulate::virtualenv_create(
  envname = "./venv",
  packages = c("matplotlib")
)
system2(
  "./venv/bin/pip3",
  c("install", "--no-binary='numpy'", "numpy", "--ignore-installed")
)
reticulate::use_virtualenv("./venv")
reticulate::py_run_string("import matplotlib.pyplot as plt;plt.plot([1, 2, 3], [1, 2, 3])")
After rebuilding our image, we can run test.R in this container without a segfault!
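For reference, the same pip invocation can be issued from Python instead of R's system2(). This is a hedged sketch, not the approach used above: the function names are hypothetical, and it calls pip as a module of a chosen interpreter (python -m pip) rather than through the pip3 script.

```python
import subprocess
import sys
from typing import List


def numpy_source_install_command(python: str = sys.executable) -> List[str]:
    """Build the pip command that reinstalls numpy from source, not from a wheel."""
    return [
        python, "-m", "pip", "install",
        "--no-binary", "numpy",   # forbid prebuilt wheels for numpy only
        "--ignore-installed",     # reinstall even if a wheel-based numpy is present
        "numpy",
    ]


def rebuild_numpy_from_source(python: str = sys.executable) -> None:
    """Run the command; raises CalledProcessError if pip exits with an error."""
    subprocess.run(numpy_source_install_command(python), check=True)
```

To target the virtualenv from the example, the interpreter path would be passed explicitly, e.g. rebuild_numpy_from_source("./venv/bin/python").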
However, compiling numpy at runtime adds ~3 minutes to every run of our R script!
A better solution could be to perform this compilation at image build time. This would mean we'd only have to wait those ~3 minutes once (at image build time), rather than every time we run our script!
A Dockerfile to do so could look like:
FROM rocker/r-ver:latest
RUN apt update && apt install -y python3 python3-dev python3-venv
RUN install2.r --error reticulate
# Create a venv
RUN python3 -m venv /root/venv
# Compile numpy from source into venv
RUN /root/venv/bin/pip3 install --no-binary="numpy" numpy --ignore-installed
COPY test.R /root/
The accompanying test.R file would then make use of reticulate::virtualenv_install() as:
reticulate::virtualenv_install(
  envname = "/root/venv",
  packages = c("matplotlib")
)
reticulate::use_virtualenv("/root/venv")
reticulate::py_run_string("import matplotlib.pyplot as plt;plt.plot([1, 2, 3], [1, 2, 3])")
NB: when running a container from the image with numpy already compiled, you'll need to either run as root (-u="root") or change the permissions on the compiled numpy version in the Dockerfile; otherwise you will encounter a permissions error.
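To find out ahead of time whether the current user will hit that permissions error, the access rights on the virtualenv can be checked. The post does not say which kind of access is missing, so this sketch (with the hypothetical helper access_report) simply reports read, write and traverse permission for a path, using only the standard library.

```python
import os
from typing import Dict


def access_report(path: str) -> Dict[str, bool]:
    """Report what the current user may do with `path`.

    Note: a path inside a directory the user cannot traverse (such as /root
    for a non-root user) is reported as not existing.
    """
    return {
        "exists": os.path.exists(path),
        "read": os.access(path, os.R_OK),
        "write": os.access(path, os.W_OK),
        "traverse": os.access(path, os.X_OK),
    }


if __name__ == "__main__":
    # /root/venv is the virtualenv location used in the Dockerfile above.
    print(access_report("/root/venv"))
```

If any entry is False for the user the container runs as, either running as root or adjusting the permissions in the Dockerfile, as described above, would be needed.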