python bash jupyter-notebook jupyter papermill

Bash script with multiple papermill commands does not fail on notebook errors


I have a refresh_data.sh file which contains multiple papermill commands, for example:

papermill notebook_1.ipynb output_1.ipynb -p start "2017-12-01" -p date "2017-12-31"
papermill notebook_2.ipynb output_2.ipynb -p start "2018-01-01" -p date "2018-01-31"

If I get an error while it is running the first notebook, the process continues executing the second one.

In other words, an error in one of the notebooks doesn't "break" the overall script.

As far as I remember, with normal Python scripts an error in one of the commands within the bash script stops the execution of the entire script.

What is the standard behaviour of a bash script in this case? Can I change it so that it stops as soon as there is an error?


Solution

  • By default, bash continues with the next command even when one fails. If your script sets set -e, it exits as soon as a command returns a non-zero status:

    Automatic exit from bash shell script on error

    #!/bin/bash
    set -e
    # Any subsequent command that fails causes the shell script to exit immediately
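
To see the effect without running real notebooks, here is a minimal sketch. It writes a throwaway script to a temporary file and runs it; `false` stands in for a papermill command that exits with a non-zero status, and the names `demo_script`, `demo_output` and `demo_status` are only illustrative.

```shell
#!/bin/bash
# Demo of `set -e`: the second "notebook" is never started.
demo_script="$(mktemp)"

cat > "$demo_script" <<'EOF'
#!/bin/bash
set -e
echo "running notebook 1"
false   # stand-in for a papermill command that fails
echo "running notebook 2"   # not reached because of set -e
EOF

# Run the demo script, keeping its stdout and its exit status.
demo_output="$(bash "$demo_script")" && demo_status=0 || demo_status=$?
rm -f "$demo_script"

echo "output: ${demo_output}"
echo "exit status: ${demo_status}"
```

Removing the `set -e` line from the generated script should make both echo lines appear, which matches the behaviour described in the question.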
    

    You can run papermill with --log-output to get more information about why your notebook fails:

    papermill "${INPUT_NOTEBOOK_PATH}" "${OUTPUT_NOTEBOOK_PATH}" --log-output
    

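    To capture the notebook's execution result, read the exit status of the previous command from $?. This check is only reached on failure when set -e is not active, because set -e exits the script before the next line runs: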

      papermill "${INPUT_NOTEBOOK_PATH}" "${OUTPUT_NOTEBOOK_PATH}" --log-output
      notebook_result=$?
      if [[ ${notebook_result} -eq 0 ]]; then
        echo "All good"
      else
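        echo "papermill failed with exit code ${notebook_result}" >&2
        exit "${notebook_result}"   # stop here so later notebooks do not run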
      fi
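
The exit-status check above can also be wrapped in a small helper so that each notebook is handled the same way. This is only a sketch: `run_notebook` is a hypothetical function, not part of papermill, and `PAPERMILL_CMD` is an illustrative variable that lets you swap in another command when trying the helper without papermill installed.

```shell
#!/bin/bash
# Hypothetical helper: run one notebook and report its exit status.
# PAPERMILL_CMD is an assumption for illustration; it defaults to papermill.
PAPERMILL_CMD="${PAPERMILL_CMD:-papermill}"

run_notebook() {
  local input="$1" output="$2"
  shift 2   # remaining arguments (e.g. -p start "2017-12-01") are passed through
  if "${PAPERMILL_CMD}" "${input}" "${output}" --log-output "$@"; then
    echo "OK: ${input}"
  else
    local status=$?   # exit status of the failed command
    echo "FAILED: ${input} (exit code ${status})" >&2
    return "${status}"
  fi
}
```

In refresh_data.sh, each call could then be written as `run_notebook notebook_1.ipynb output_1.ipynb -p start "2017-12-01" -p date "2017-12-31" || exit $?`, so the script stops at the first failing notebook even without set -e.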