Post a pandas dataframe from Jupyter Notebooks into a Stack Overflow problem


What are the steps to post a Pandas dataframe in a Stack Overflow question?

I found: How to make good reproducible pandas examples.
I followed the instructions and used pd.read_clipboard, but I still had to spend a significant amount of time formatting the table to make it look correct.

I also found: How to display a pandas dataframe on a Stack Overflow question body.

I tried to copy the dataframe from Jupyter and paste it into a Blockquote. As mentioned, I also ran pd.read_clipboard('\s\s+') in Jupyter to copy it to the clipboard and then pasted it into a Blockquote.
I also tried creating a table and pasting the values in the table.
All of these methods required me to tweak the result by hand before it looked properly formatted.

An example dataframe:

import pandas as pd

df = pd.DataFrame(
    [['Captain', 'Crunch', 72],
     ['Trix', 'Rabbit', 36],
     ['Count', 'Chocula', 41],
     ['Tony', 'Tiger', 54],
     ['Buzz', 'Bee', 28],
     ['Toucan', 'Sam', 38]],
    columns=['first_name', 'last_name', 'age'])

Solution

  • .to_markdown()

    The easiest method I found was to use print(df.to_markdown()).

    This converts the data into Markdown table syntax, which Stack Overflow can render. For example, with your dataframe, the output is:

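    |    | first_name   | last_name   |   age |
    |---:|:-------------|:------------|------:|
    |  0 | Captain      | Crunch      |    72 |
    |  1 | Trix         | Rabbit      |    36 |
    |  2 | Count        | Chocula     |    41 |
    |  3 | Tony         | Tiger       |    54 |
    |  4 | Buzz         | Bee         |    28 |
    |  5 | Toucan       | Sam         |    38 |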

    Note that you might need to install the tabulate package first (pip install tabulate), since to_markdown() depends on it.
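
A minimal sketch of the same idea, assuming pandas 1.1 or later (for the index argument) and a shortened two-row version of the example dataframe. The variable name markdown_table is only illustrative. The try/except guards against tabulate being absent:

```python
import pandas as pd

# Shortened version of the example dataframe (two rows only, for illustration)
df = pd.DataFrame(
    [['Captain', 'Crunch', 72],
     ['Trix', 'Rabbit', 36]],
    columns=['first_name', 'last_name', 'age'])

try:
    # index=False leaves out the 0..n-1 row labels, which a question rarely needs
    markdown_table = df.to_markdown(index=False)
except ImportError:
    # pandas raises ImportError when the optional tabulate dependency is missing
    markdown_table = None
    print("tabulate is not installed; try: pip install tabulate")

if markdown_table is not None:
    # Paste the printed text straight into the question body
    print(markdown_table)
```

Dropping the index keeps the pasted table narrower. Keep the default index=True if the row labels matter to the question.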

  • .to_dict()

    Another option is to use df.head().to_dict('list'). It might not be the best choice for large datasets, but it works well for minimal reproducible examples:

    {'first_name': ['Captain', 'Trix', 'Count', 'Tony', 'Buzz'], 'last_name': ['Crunch', 'Rabbit', 'Chocula', 'Tiger', 'Bee'], 'age': [72, 36, 41, 54, 28]}
    

    Anyone can rebuild the dataframe by passing this dictionary to pd.DataFrame().
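
As a rough sketch of that round trip, here is how a reader might rebuild the dataframe from a pasted dictionary. The names data and rebuilt are only illustrative, and the values are the first five rows of the example dataframe:

```python
import pandas as pd

# Dictionary as a reader would copy it from the question body
# (the kind of output df.head().to_dict('list') produces)
data = {
    'first_name': ['Captain', 'Trix', 'Count', 'Tony', 'Buzz'],
    'last_name': ['Crunch', 'Rabbit', 'Chocula', 'Tiger', 'Bee'],
    'age': [72, 36, 41, 54, 28],
}

# Each key becomes a column; each list supplies that column's values in row order
rebuilt = pd.DataFrame(data)
print(rebuilt)

# Converting back gives the same dictionary, so nothing is lost in the round trip
print(rebuilt.to_dict('list') == data)
```

The 'list' orientation keeps the pasted text compact, because each column name appears once rather than once per row.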