Tags: python, visual-studio-code, jupyter-notebook, export, cell

How do I show only certain cells' input or output when exporting a Jupyter Notebook from VSCode?


I only want certain cells and certain cell outputs to show up when I export my Jupyter Notebook from VSCode. I have not been able to get an answer that works from Google, StackOverflow, or ChatGPT.

So when I export the .ipynb file to HTML in VSCode, how do I control which cells are included in the HTML and which are not? For example, what would I do to include just the output of the cell below and not the actual code?

import pandas as pd
import seaborn as sns
df = pd.read_csv('file.csv')
sns.histplot(df['Variable 1'])

This post seems to indicate the best/only option is tagging cells then removing them with nbconvert. This seems inefficient in VSCode, especially compared to the easy output = FALSE or echo = FALSE in RStudio.

This seems like it should be an easy and common question, but I am getting no good solutions from the internet. ChatGPT suggested adding #hide-in-export to the cells I didn't want, but that didn't work. The StackOverflow post I linked suggested using TagRemovePreprocessor with nbconvert and marking all the cells I want gone, but that seems clunky.

Follow-up question: if I do tag cells and remove them on export with nbconvert, what is the fastest way to tag cells in VSCode?


Solution

  • I still don't know if there is an easier way, but here is what I have done with help from ChatGPT, this blog post, and this StackOverflow answer.

    First, define a function that adds a tag to the cells you want to hide:

    import json

    def add_cell_tag(nb_path, tag, cell_indices):
        # Open the .ipynb file (a notebook is plain JSON)
        with open(nb_path, 'r', encoding='utf-8') as f:
            nb = json.load(f)
        # Get the cells from the notebook
        cells = nb['cells']
        # Add the tag to the specified cells
        for index in cell_indices:
            cell = cells[index]
            if 'metadata' not in cell:
                cell['metadata'] = {}
            if 'tags' not in cell['metadata']:
                cell['metadata']['tags'] = []
            # Skip cells that already carry the tag, so reruns add no duplicates
            if tag not in cell['metadata']['tags']:
                cell['metadata']['tags'].append(tag)
        # Save the modified notebook (indented so the file stays readable)
        with open(nb_path, 'w', encoding='utf-8') as f:
            json.dump(nb, f, indent=1)
    

    Second, call the function to add a tag (any string works) to the cells you want to hide in the HTML export. The indices are zero-based positions in the notebook:

    nb_path = 'path/to/notebook.ipynb'
    add_cell_tag(nb_path, 'hide-code', [0, 1, 2])
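
    Listing indices by hand can get tedious in a long notebook. One possible shortcut, which is not part of the original answer, is to put a marker comment in each cell you want to hide and let a small script turn the marker into a tag. The function name tag_cells_with_marker and the "# hide-code" marker below are hypothetical conventions chosen for this sketch; neither nbconvert nor VSCode recognises the marker by itself.

```python
import json


def tag_cells_with_marker(nb_path, tag, marker="# hide-code"):
    """Tag every code cell whose source contains `marker`; return their indices.

    `marker` is a hypothetical convention for this sketch, not something
    nbconvert or VSCode understands on its own.
    """
    with open(nb_path, "r", encoding="utf-8") as f:
        nb = json.load(f)

    tagged = []
    for index, cell in enumerate(nb["cells"]):
        if cell.get("cell_type") != "code":
            continue
        # In the notebook JSON, "source" is either a string or a list of lines.
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        if marker in source:
            tags = cell.setdefault("metadata", {}).setdefault("tags", [])
            if tag not in tags:
                tags.append(tag)
            tagged.append(index)

    with open(nb_path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1)
    return tagged
```

    Calling tag_cells_with_marker('path/to/notebook.ipynb', 'hide-code') would then tag every marked code cell in one pass, and the returned list shows which indices were affected.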
    

    Finally, use nbconvert in the terminal to export and filter the notebook:

    jupyter nbconvert --to html --TagRemovePreprocessor.remove_cell_tags=hide-code  path/to/notebook.ipynb
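
    If you would rather trigger the export from Python (for instance from a scratch script next to the notebook), the same command can be assembled and run with subprocess. This is only a sketch: build_export_command and export_notebook are hypothetical helper names, and it assumes the jupyter executable is on your PATH.

```python
import subprocess


def build_export_command(nb_path, tag="hide-code", option="remove_cell_tags"):
    """Return the nbconvert argument list for an HTML export filtered by tag.

    `option` is the TagRemovePreprocessor setting that should receive the tag.
    """
    return [
        "jupyter", "nbconvert", "--to", "html",
        f"--TagRemovePreprocessor.{option}={tag}",
        nb_path,
    ]


def export_notebook(nb_path, tag="hide-code", option="remove_cell_tags"):
    """Run the export; raises CalledProcessError if nbconvert exits with an error."""
    subprocess.run(build_export_command(nb_path, tag, option), check=True)
```

    With the defaults, export_notebook('path/to/notebook.ipynb') runs the same command as the terminal line above.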
    

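    Cells may be removed entirely, or only their input or only their output may be removed, depending on which option you pass the tag to. Besides remove_cell_tags, which drops the whole cell, there are:

    TagRemovePreprocessor.remove_input_tags
    TagRemovePreprocessor.remove_single_output_tags
    TagRemovePreprocessor.remove_all_outputs_tags

    For the example in the question (keep the plot, hide the code), remove_input_tags is the one to use.

    I was not sure of the difference between those last two at first. As far as I can tell from the nbconvert documentation, remove_all_outputs_tags matches tags on the cell and drops all of that cell's outputs, while remove_single_output_tags matches tags stored in the metadata of an individual output and drops only that output. Additionally, I had a helper function to count the cells in the notebook and one to clear all tags in the notebook.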
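
    The two helper functions are not shown in the answer. A minimal sketch of what they might look like, reading the notebook JSON the same way as add_cell_tag, is below; the names count_cells and clear_all_tags are my own guesses, not the author's code.

```python
import json


def count_cells(nb_path):
    """Return the number of cells in the notebook at nb_path."""
    with open(nb_path, "r", encoding="utf-8") as f:
        nb = json.load(f)
    return len(nb["cells"])


def clear_all_tags(nb_path):
    """Remove the 'tags' entry from every cell; return how many cells had one."""
    with open(nb_path, "r", encoding="utf-8") as f:
        nb = json.load(f)

    cleared = 0
    for cell in nb["cells"]:
        # pop() returns None when the cell has no metadata or no tags entry.
        if cell.get("metadata", {}).pop("tags", None) is not None:
            cleared += 1

    with open(nb_path, "w", encoding="utf-8") as f:
        json.dump(nb, f, indent=1)
    return cleared
```

    count_cells is handy for checking that the indices you pass to add_cell_tag are in range, and clear_all_tags gives you a clean slate before tagging for a different export.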