Jupyter Notebook nbconvert without magic commands or markdown


I have a Jupyter notebook and I'd like to convert it into a Python script using the nbconvert command from within the Jupyter notebook.

I have included the following line at the end of the notebook:

!jupyter nbconvert --to script <filename>.ipynb

This creates a Python script. However, I'd like the resulting .py file to have the following properties:

  1. No input statements, such as:

    # In[27]:

  2. No markdown, including statements such as:

    # coding: utf-8

  3. Ignore %magic commands such as:

    1. %matplotlib inline
    2. !jupyter nbconvert --to script <filename>.ipynb, i.e. the command within the notebook that executes the Python conversion

    Currently, the %magic commands get translated to the form get_ipython().magic(...), but get_ipython() is only defined inside an IPython session, so these lines fail when the script is run with plain Python.


Solution

  • One way to get control of what appears in the output is to tag the cells that you don't want in the output and then use the TagRemovePreprocessor to remove the cells.

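    The command below also sets the exclude_markdown option of the TemplateExporter to drop markdown cells. Here the cells to remove are assumed to carry the tag 'parameters'; use whatever tag you applied.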

    !jupyter nbconvert \
        --TagRemovePreprocessor.enabled=True \
        --TagRemovePreprocessor.remove_cell_tags="['parameters']" \
        --TemplateExporter.exclude_markdown=True \
        --to python "notebook_with_parameters_removed.ipynb"
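
If you'd rather not depend on nbconvert options at all, the same tag-based idea can be sketched directly against the notebook file, since an .ipynb is JSON in which each cell has a cell_type, a source, and (optionally) a metadata.tags list. This is only a rough sketch, not how nbconvert does it: the function name notebook_to_script and its skip_tags parameter are made up for illustration, and it naively drops any line starting with % or !, which would mishandle %% cell magics or a Python continuation line that happens to begin with %.

```python
import json


def notebook_to_script(nb_json, skip_tags=("parameters",)):
    """Return only the code-cell source, skipping tagged cells and magic lines.

    Hypothetical helper: nb_json is the text of an .ipynb file.
    """
    nb = json.loads(nb_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # drops markdown and raw cells
        tags = cell.get("metadata", {}).get("tags", [])
        if any(tag in skip_tags for tag in tags):
            continue  # drops cells carrying one of the unwanted tags
        source = cell.get("source", "")
        if isinstance(source, list):  # source may be a list of lines
            source = "".join(source)
        # Naive magic filter: drop lines that start with % or !
        kept = [line for line in source.splitlines()
                if not line.lstrip().startswith(("%", "!"))]
        text = "\n".join(kept).strip()
        if text:
            chunks.append(text)
    return "\n\n".join(chunks) + "\n"
```

You would call it with the notebook's text, e.g. notebook_to_script(Path('my_notebook.ipynb').read_text()), where my_notebook.ipynb is a placeholder name, and write the result to a .py file yourself. Because no prompt markers or coding line are ever generated, no post-processing is needed with this route.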
    

    To remove the commented lines and the input prompt markers (like # In[1]:), I believe you'll need to post-process the Python file. Something like the following works when placed in the cell after the one that calls !jupyter nbconvert (note that this is Python 3 code):

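    import re
    from pathlib import Path

    # Path of the script that nbconvert just wrote
    filename = Path.cwd() / 'notebook_with_parameters_removed.py'
    code_text = filename.read_text().split('\n')
    # Keep blank lines; drop lines starting with '#' (prompt markers,
    # coding line, comments) and any translated magic (get_ipython() calls)
    lines = [line for line in code_text if len(line) == 0 or
            (line[0] != '#' and 'get_ipython()' not in line)]
    clean_code = '\n'.join(lines)
    # Collapse runs of blank lines into a single blank line
    clean_code = re.sub(r'\n{2,}', '\n\n', clean_code)
    filename.write_text(clean_code.strip())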
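
To see what that filter actually removes, here is the same logic wrapped in a function and run on a small string that imitates nbconvert's script output. The function name strip_nbconvert_residue and the sample text are invented for this illustration; the one deliberate difference is that it ends the result with a newline.

```python
import re


def strip_nbconvert_residue(script_text):
    """Drop lines starting with '#' and get_ipython() calls, squeeze blank lines."""
    kept = [
        line for line in script_text.split("\n")
        if not line.startswith("#") and "get_ipython()" not in line
    ]
    # Collapse three or more newlines (two or more blank lines) into one blank line
    cleaned = re.sub(r"\n{3,}", "\n\n", "\n".join(kept))
    return cleaned.strip() + "\n"


# Made-up sample resembling a script produced by nbconvert
sample = "\n".join([
    "#!/usr/bin/env python",
    "# coding: utf-8",
    "",
    "# In[1]:",
    "",
    "",
    "get_ipython().run_line_magic('matplotlib', 'inline')",
    "x = 1",
    "",
    "",
    "# In[2]:",
    "",
    "",
    "print(x)",
])

print(strip_nbconvert_residue(sample))
```

Only the two real statements survive, separated by one blank line. Note that every line beginning with # is dropped, so any top-level comments you wrote yourself disappear too, while indented comments are left alone.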