python matplotlib jupyter-notebook google-colaboratory stdout

`%%capture` in Jupyter Notebook or Colab doesn't work for storing cell output that displays an image or a pandas table


I have read the docs about %%capture cap. It only works for text representations. I expected it to store everything that has been displayed in the cell output: images, tables, even HTML elements.

Then I could load it in another cell and get identical output:

CELL_A

%%capture cap

import matplotlib.pyplot as plt

# Data
x = [1, 2]
y = [3, 4]

# Create a simple plot
plt.plot(x, y)

# Add labels to the axes
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

# Add a title to the plot
plt.title('Simple Plot Example')

# Show the plot
plt.show()

# Save the captured output to a text file
with open('stdout.txt', 'w') as file:
  file.write(cap.stdout)

CELL_B

#@title Reloading CELL_A output

with open('stdout.txt', 'r') as file:
  cell_a_out = file.read()

display(cell_a_out)

It should display the matplotlib image. I know matplotlib provides a method to save the figure as an image, but I want to save everything that has been displayed in the cell output, not only the image. That way, if the cell outputs both a pandas table and a matplotlib image, they can be stored in a single file.


Solution

  • As I discussed in my comments, your post reflects several key misunderstandings:

    • What matplotlib displays for the example plotting code in your OP (your CELL_A). (It isn't an image!)
    • The idea that you can use the captured output in the same cell that captures it (see your CELL_A in your OP). Here takluyver puts it succinctly as, "%%capture won't define the variable until after the code inside it has run, so it's not meant to be used in the cell it's capturing."
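
    The ordering in that quote can be sketched with a toy model that uses only the standard library. This is an illustration under assumptions, not IPython's actual implementation: run_captured is a hypothetical helper, and it captures only stdout. The point is that the name is bound after the body has finished, so the body itself cannot read it.

```python
import contextlib
import io


def run_captured(code, namespace, name):
    """Toy stand-in for %%capture: run `code`, then bind its stdout to `name`."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, namespace)
    # The binding happens only here, after the cell body has finished.
    namespace[name] = buffer.getvalue()


ns = {}
first_run_error = None
try:
    # Like CELL_A: the body tries to read `cap` while it is still running.
    run_captured("print('plotting'); saved = cap", ns, "cap")
except NameError as err:
    first_run_error = str(err)

# Reading `cap` from a later "cell" works, because the first one has completed.
run_captured("print('plotting')", ns, "cap")
later_value = ns["cap"]
```

    On a fresh namespace the first call fails with a NameError, while the second leaves the captured text in ns["cap"]. In a real notebook, re-running CELL_A would instead pick up the cap left over from the previous run, which is just as misleading.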

    Here is how you can do what you say in the title and more using %%capture.

    First cell (your CELL_A, minus the ill-informed attempt to use the captured object before the cell has finished running):

    CELL 1

    %%capture cap
    
    import matplotlib.pyplot as plt
    
    # Data
    x = [1, 2]
    y = [3, 4]
    
    # Create a simple plot
    plt.plot(x, y)
    
    # Add labels to the axes
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    
    # Add a title to the plot
    plt.title('Simple Plot Example')
    
    # Show the plot
    plt.show()
    

    (By the way, plt.show() is generally no longer needed: with the Python kernel, modern Jupyter notices that a matplotlib plot has been made and displays it whether plt.show() is there or not. Here, though, it is worth keeping because you are using %%capture. Without it, cap may also include the value of the last expression evaluated in the cell, for example the plot title object.)

    Note that if you remove the %%capture line, what the code in the previous cell produces still isn't an image at this point. It is a matplotlib plot object that Jupyter handles displaying.

    Reshow the captured output in cell #2 (based on here):

    CELL 2

    #Reloading CELL 1 output
    cap.show()
    

    (Alternatively, cap() as the content of the cell will also display the plot later.)

    Alternative CELL 2 option

    from IPython.display import display
    display(cap.outputs[0])
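
    To keep those captured outputs in a single file, as the question asks, one possible sketch is to write each output's MIME bundle to JSON. It assumes every item in cap.outputs exposes .data and .metadata attributes, as IPython's rich output objects do. The helper names dump_outputs and load_outputs and the file name used below are hypothetical.

```python
import base64
import json


def dump_outputs(outputs, path):
    """Write the MIME bundles of captured rich outputs to a JSON file."""
    bundles = []
    for out in outputs:
        data = {}
        for mime, value in out.data.items():
            # Image payloads may be raw bytes depending on the IPython version;
            # store them base64-encoded, which is how notebooks keep them too.
            if isinstance(value, bytes):
                value = base64.b64encode(value).decode("ascii")
            data[mime] = value
        bundles.append({"data": data, "metadata": dict(out.metadata or {})})
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(bundles, fh)


def load_outputs(path):
    """Read back the list of bundles written by dump_outputs."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

    In a cell after CELL 1 you would call dump_outputs(cap.outputs, "cell_1_outputs.json"). In a later cell or session, loop over load_outputs("cell_1_outputs.json") and pass each bundle to display(bundle["data"], metadata=bundle["metadata"], raw=True), where raw=True tells IPython the dict is already keyed by MIME type. Plots and dataframes shown with display() should then reappear as they first rendered, though I have only described the idea here, not tested it on every frontend.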
    

    Okay, what if you did want to make the object generated in cell #1 an image and display it in the notebook?
    First, we'll assign the plot object to a handle that we can refer to again. Then we'll make it into an image file, and in a final step display that image in the notebook.

    ALTERNATIVE CELL 1

    %%capture cap
    
    import matplotlib.pyplot as plt
    
    # Data
    x = [1, 2]
    y = [3, 4]
    
    # Create a simple plot
    my_plot = plt.plot(x, y)
    
    # Add labels to the axes
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    
    # Add a title to the plot
    plt.title('Simple Plot Example')
    
    # Show the plot
    plt.show()
    

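    Note that the only difference is that I assigned a handle to the plot object so it is easy to recall later. (plt.plot() returns a list of Line2D objects, which is why the following cells go through my_plot[0].figure to reach the figure.)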

    Next, based on here, we can show the plot in another cell. (%%capture cap swallowed the output, so it wasn't already shown as output from the first cell.)

    ALTERNATIVE CELL 2

    my_plot[0].figure
    

    Let's save that matplotlib object as an image file:

    ALTERNATIVE CELL 3

    # make an image file of the matplotlib object
    my_plot[0].figure.savefig("my_plot_as_image.png")
    

    ALTERNATIVE CELL 4

    #display the image of the plot made
    from IPython.display import Image
    Image("my_plot_as_image.png")
    

    Note that when the Jupyter .ipynb file gets saved, the displayed image is embedded in the saved .ipynb as Base64-encoded text. (You can even recover that embedded image as a file later if you need to, see here.)
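
    As a rough sketch of that recovery step, the function below reads a saved notebook as plain JSON and decodes every embedded PNG. It assumes the nbformat 4 layout, where each output of a code cell may carry a "data" dict keyed by MIME type. The name extract_png_images is hypothetical.

```python
import base64
import json


def extract_png_images(notebook_path):
    """Return the decoded bytes of every PNG embedded in a notebook's outputs."""
    with open(notebook_path, encoding="utf-8") as fh:
        notebook = json.load(fh)

    images = []
    for cell in notebook.get("cells", []):
        # Markdown cells have no "outputs"; stream and error outputs have no "data".
        for output in cell.get("outputs", []):
            payload = output.get("data", {}).get("image/png")
            if payload is None:
                continue
            if isinstance(payload, list):  # multi-line strings may be stored as lists
                payload = "".join(payload)
            images.append(base64.b64decode(payload))
    return images
```

    Each item in the returned list can be written to disk with open("recovered.png", "wb"), where the file name is only an example.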

    If you wanted to combine it with a dataframe, there are a number of options. See 'Related' below for some ideas, as well as dataframe_image if you want the dataframe as an image to sit alongside the plot image.
    You could then use jupyter nbconvert commands to save the notebook as HTML (see here) and extract the HTML you want. That is an alternative route to getting the HTML of your items.
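
    If a single HTML file is the real goal, another option is to build it directly. The sketch below uses only the standard library and embeds the image as a data URI. The function name build_html_report is hypothetical, and it assumes you already have the PNG bytes and an HTML fragment for the table.

```python
import base64
import html


def build_html_report(png_bytes, table_html, title="Cell output"):
    """Return one self-contained HTML page holding a table and a PNG image."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return (
        "<!DOCTYPE html>\n"
        "<html><head><meta charset='utf-8'>"
        f"<title>{html.escape(title)}</title></head>\n"
        "<body>\n"
        f"{table_html}\n"
        # A data URI embeds the image in the page, so no separate file is needed.
        f"<img alt='plot' src='data:image/png;base64,{encoded}'>\n"
        "</body></html>\n"
    )
```

    In the notebook, the PNG bytes could come from saving the figure into an io.BytesIO buffer with my_plot[0].figure.savefig(buffer, format="png"), and the table fragment from to_html() on your dataframe. Writing the returned string to a .html file gives one file that holds both.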

    I like PDFs for reports, so I'd probably combine images of the plot and dataframe into a PDF using ReportLab or Pillow.

    Finally...
    Your title and original post also seem to reflect an XY problem. By the end of your post, it seems that what you are really trying to do is make a file that contains just the plot and the dataframe. In the future, research and ask about the goal you have in mind, not about how you think you might be able to accomplish it. See why not to do that here.

    Related: