
How to copy/paste DataFrame from Stack Overflow into Python


In questions and answers, users very often post an example DataFrame which their question/answer works with:

In []: x
Out[]: 
   bar  foo
0    4    1
1    5    2
2    6    3

It'd be really useful to be able to get this DataFrame into my Python interpreter so I can start debugging the question, or testing the answer.

How can I do this?


Solution

  • Pandas is written by people who really know what users want to do.

    Since version 0.13 there's a function pd.read_clipboard which is absurdly effective at making this "just work".

    Copy the part of the question that starts with bar foo (i.e. the DataFrame itself) to the clipboard, then run this in a Python interpreter:

    In [53]: import pandas as pd
    In [54]: df = pd.read_clipboard()
    
    In [55]: df
    Out[55]: 
       bar  foo
    0    4    1
    1    5    2
    2    6    3
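
If clipboard access isn't available (for example on a remote or headless session), roughly the same result can be had by pasting the text into a string. This is a minimal sketch that assumes read_clipboard behaves like read_csv with a whitespace separator, which is what the pandas documentation describes; the variable name text is only for illustration.

```python
import io

import pandas as pd

# The text you would otherwise copy to the clipboard (no In/Out prompts).
text = """\
   bar  foo
0    4    1
1    5    2
2    6    3
"""

# Split on runs of whitespace. The header has one field fewer than the
# data rows, so pandas uses the first column as the index.
df = pd.read_csv(io.StringIO(text), sep=r"\s+")
print(df)
```

The same keyword arguments (sep, engine, and so on) can be passed to pd.read_clipboard, so a call that works here should carry over.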
    

    Caveats

    • Don't include the IPython In or Out prompts, or it won't work.
    • If you have a named index, you currently need to add engine='python' (see this issue on GitHub), because the 'c' engine is broken when the index is named.
    • It's not brilliant at MultiIndexes:

    Try this:

                          0         1         2
    level1 level2                              
    foo    a       0.518444  0.239354  0.364764
           b       0.377863  0.912586  0.760612
    bar    a       0.086825  0.118280  0.592211
    

    which doesn't work at all, or this:

                  0         1         2
    foo a  0.859630  0.399901  0.052504
        b  0.231838  0.863228  0.017451
    bar a  0.422231  0.307960  0.801993
    

    which runs without error, but returns something totally incorrect!
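
One possible workaround for the second MultiIndex layout is to treat the text as fixed-width columns rather than whitespace-separated fields, then rebuild the index by hand. This is only a sketch, not something read_clipboard does for you: the column names level1 and level2 are made up for the example, and it relies on pd.read_fwf inferring the column boundaries from the data rows.

```python
import io

import pandas as pd

text = """\
              0         1         2
foo a  0.859630  0.399901  0.052504
    b  0.231838  0.863228  0.017451
bar a  0.422231  0.307960  0.801993
"""

# Skip the header line and name every column explicitly
# (level1/level2 are assumed names, not taken from the pasted text).
names = ["level1", "level2", "0", "1", "2"]
df = pd.read_fwf(io.StringIO(text), skiprows=1, header=None, names=names)

# A blank outer-level cell means "same as the row above", so forward-fill it
# before turning the two label columns into a MultiIndex.
df["level1"] = df["level1"].ffill()
df = df.set_index(["level1", "level2"])
print(df)
```

Fixed-width parsing depends on the columns staying aligned, so it will break if the pasted text has lost its original spacing.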