Tags: jupyter-notebook, jupyter-lab, snakemake, jupyter-kernel

Snakemake - Jupyter lab notebook does not find kernel


When writing a program in a Jupyter Lab notebook using R or Python, I install specific conda environments as kernels so that I can access environment-specific packages from a single Jupyter Lab installation in the base conda environment.

After finishing developing the notebook, I want to plug it into my Snakemake file to ensure later reproducibility. Of course, I do that with the respective conda environment .yaml file, so that all the needed packages/libraries are provided.

Now comes the problem: the rule that references the notebook is not reproducible on another machine/environment, because it tries to access/run a kernel that is specific to my development environment.

Does anyone have a workaround or solution to this particular problem?

EDIT: More detailed steps leading to my problem

  1. Set up a conda environment (matplotlib_env) for a specific tool or task (here: matplotlib):
conda create -n matplotlib_env python=3.8 ipykernel matplotlib nbconvert
  2. I created an IPython kernel so that my Jupyter Lab instance from the base environment can access the conda environment and its packages (matplotlib):
python -m ipykernel install --user --name matplotlib_env --display-name "Python_matplotlib"
  3. I created a short notebook (make_plot.ipynb) that uses the kernel configured in step 2 (Python_matplotlib):
import matplotlib.pyplot as plt
plt.plot([1,2,3],[1,4,9])
plt.savefig('exp.png')
  4. Having finished the task, I want to plug the notebook into my Snakemake workflow for management and reproducibility reasons. For that, I define a rule with the respective notebook and conda environment .yaml file (automatically generated via conda env export > matplotlib_env.yaml):
rule make_plot:
    output:
        "exp.png"
    conda:
        "matplotlib_env.yaml"
    notebook:
        "make_plot.ipynb"
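One portability detail worth checking: conda env export appends a machine-specific prefix: line to the .yaml file, which points at the local miniconda installation. A minimal sketch of stripping it (the helper name is mine, not from the original post):

```python
def strip_prefix(yaml_text):
    """Return conda-env YAML text without the machine-specific 'prefix:' line,
    so the file can be shared across machines."""
    return "".join(
        line for line in yaml_text.splitlines(keepends=True)
        if not line.startswith("prefix:")
    )
```

Applied to the exported file, e.g. strip_prefix(open("matplotlib_env.yaml").read()), before committing it alongside the Snakefile.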
  5. Executing the rule works fine as long as the original conda environment (with the configured kernel) still exists. As soon as I remove the original conda environment (and, optionally, the kernel from the kernelspec list), I get an error: the kernel can no longer be found, and the conda environment generated by Snakemake does not contain the kernel the notebook is looking for.
snakemake -p --cores 1 --use-conda make_plot
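The root cause is that the kernel name is recorded inside the notebook's JSON metadata, so the notebook keeps pointing at the development kernel even after the environment is gone. A minimal sketch of reading that field with only the standard library (the metadata layout follows the nbformat spec):

```python
import json

def notebook_kernel(nb_json):
    """Return the kernel name recorded in a notebook's JSON metadata."""
    nb = json.loads(nb_json)
    return nb["metadata"]["kernelspec"]["name"]
```

Running this on make_plot.ipynb would show the environment-specific name (matplotlib_env) that the error messages below refer to.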

ERROR message after removing the original environment

[NbConvertApp] ERROR | Failed to run command:
...
FileNotFoundError: [Errno 2] No such file or directory: '/home/miniconda3/envs/matplotlib_env/bin/python'

ERROR message after removing kernel from kernel list

...
raise NoSuchKernel(kernel_name)
jupyter_client.kernelspec.NoSuchKernel: No such kernel named matplotlib_env

Solution

  • I found a workaround: before saving your notebook for the last time, switch the kernel to the default Python or R kernel from your base environment. These always have the same default name after installation (e.g. "Python 3"). When the notebook is then executed via Snakemake, it looks for the default Python/R kernel and finds the fresh installation provided via the Snakemake --use-conda option.

    This is not the most elegant solution, but if you need/want to plug notebooks into your workflow, it is good enough.
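The kernel switch can also be scripted instead of done through the Jupyter UI, by rewriting the notebook's kernelspec metadata back to the default kernel. A sketch using only the standard library (the function name is mine; "python3" is the default kernel name installed by ipykernel):

```python
import json

def reset_kernel(path, name="python3", display_name="Python 3"):
    """Point a notebook's kernelspec metadata back at the default Python kernel."""
    with open(path) as f:
        nb = json.load(f)
    nb.setdefault("metadata", {})["kernelspec"] = {
        "name": name,
        "display_name": display_name,
        "language": "python",
    }
    with open(path, "w") as f:
        json.dump(nb, f, indent=1)
```

Calling reset_kernel("make_plot.ipynb") before committing the notebook would make step 5 independent of the development kernel.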

    The most rigorous approach would be to turn the notebook, once done, into a script with clearly defined inputs and outputs. This script is then plugged into the Snakefile, via a corresponding rule, instead of the notebook.
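For the notebook above, the converted script could look like the following sketch. It assumes Snakemake's script: directive (used in the rule in place of notebook:), which injects a snakemake object carrying the rule's input/output into the script:

```python
# make_plot.py - the notebook's content turned into a plain script.
import matplotlib
matplotlib.use("Agg")  # render without a display, as in a workflow run
import matplotlib.pyplot as plt

def make_plot(outpath):
    """Recreate the notebook's plot and save it to outpath."""
    plt.plot([1, 2, 3], [1, 4, 9])
    plt.savefig(outpath)

# When executed by Snakemake (script: "make_plot.py"), the rule's
# declared output is available on the injected `snakemake` object.
if "snakemake" in globals():
    make_plot(snakemake.output[0])
```

Because all paths come from the rule, the script no longer carries any kernel metadata at all, which removes the reproducibility problem entirely.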