Setting environment variables before the execution of the pyiron wrapper on remote cluster


I use a jobfile for SLURM in ~/pyiron/resources/queues/, which looks roughly like this:

#!/bin/bash
#SBATCH --output=time.out
#SBATCH --job-name={{job_name}}
#SBATCH --workdir={{working_directory}}
#SBATCH --get-user-env=L
#SBATCH --partition=cpu

module load some_python_module
export PYTHONPATH=path/to/lib:$PYTHONPATH

echo {{command}}

As you can see, I need to load a module to access the correct Python version before calling "python -m pyiron.base.job.wrappercmd ...", and I also want to set the PYTHONPATH variable.

Setting the environment directly in the SLURM jobfile works, of course, but it is very inconvenient: I need a new jobfile under ~/pyiron/resources/queues/ whenever I want to run a calculation with a slightly different environment. Ideally, I would like to adjust the environment directly in the Jupyter notebook. Something like an {{environment}} block in the above jobfile, configurable from Jupyter, seems to be a nice solution. As far as I can tell, this is impossible with the current version of pyiron and pysqa. Is there a similar solution available?

As an alternative, I could imagine storing the above jobfile next to the Jupyter notebook. This would also make it easier for my colleagues to reproduce my calculations. Is there an option to specify a particular file to be used as the jinja2 template for the jobfile?

I could achieve my intended setup by writing a temporary jobfile under ~/pyiron/resources/queues/ via Jupyter before running the pyiron job, but this feels like quite a hacky solution.
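
A minimal sketch of that workaround, using only the Python standard library. The {{environment}} placeholder, the shortened template, and the helper names render_environment_block and fill_environment are all hypothetical; none of them are defined by pyiron or pysqa. The sketch only builds the jobfile text and leaves the other {{...}} fields in place for the queue adapter to fill in.

```python
import re

# Hypothetical, shortened template. "{{environment}}" is an assumed
# placeholder for illustration, not one that pyiron or pysqa define.
TEMPLATE = """#!/bin/bash
#SBATCH --job-name={{job_name}}
#SBATCH --partition=cpu

{{environment}}

echo {{command}}
"""


def render_environment_block(modules=(), variables=None):
    """Return shell lines that load modules and export variables."""
    lines = [f"module load {name}" for name in modules]
    for key, value in (variables or {}).items():
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", key):
            raise ValueError(f"invalid variable name: {key!r}")
        # The value is inserted verbatim so that references such as
        # $PYTHONPATH are still expanded by the shell.
        lines.append(f"export {key}={value}")
    return "\n".join(lines)


def fill_environment(template, block):
    """Replace only the environment placeholder; other fields stay as-is."""
    return template.replace("{{environment}}", block)


block = render_environment_block(
    modules=["some_python_module"],
    variables={"PYTHONPATH": "path/to/lib:$PYTHONPATH"},
)
jobfile_text = fill_environment(TEMPLATE, block)
print(jobfile_text)
```

Writing jobfile_text to a file under ~/pyiron/resources/queues/ before submitting would then be the temporary-jobfile step described above.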

Thank you very much,

Florian


Solution

  • To explain the example in a bit more detail:

    I create a notebook named readenv.ipynb with the following content:

    import subprocess
    subprocess.check_output("echo ${My_SPECIAL_VAR}", shell=True)
    

    This reads the environment variable My_SPECIAL_VAR.

    I can now submit this job from a second Jupyter notebook:

    import os
    os.environ["My_SPECIAL_VAR"] = "SoSpecial"
    from pyiron import Project
    pr = Project("envjob")
    job = pr.create_job(pr.job_type.ScriptJob, "script")
    job.script_path = "readenv.ipynb"
    job.server.queue = "cm"
    job.run()
    

    Here I first set the environment variable and then submit a script job. The script job can read the variable because it is forwarded through the --get-user-env=L option. So you should be able to define the environment in the Jupyter notebook that you use to submit the calculation.
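
The local half of this mechanism can be checked without a cluster. The sketch below is only an illustration, assuming the submission command is started as a child process of the notebook kernel: a variable set through os.environ is inherited by any process started afterwards. Whether it then reaches the compute node depends on how the scheduler is configured to propagate the submission environment, which this sketch does not test.

```python
import os
import subprocess
import sys

# Set the variable in the current (notebook) process.
os.environ["My_SPECIAL_VAR"] = "SoSpecial"

# Start a child Python process and let it report what it sees.
child_code = "import os; print(os.environ.get('My_SPECIAL_VAR', '<unset>'))"
seen_by_child = subprocess.check_output(
    [sys.executable, "-c", child_code], text=True
).strip()

print(seen_by_child)
```

If the child prints the value set in the parent, the variable is part of the environment that a submission command launched from the same notebook would start with.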