
Why do some commands raise an error depending on the command that ran before them?


This is a pandas question.

Try running this in a Jupyter Notebook:

In [1]: import pandas as pd

        df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                          index=['cobra', 'viper', 'sidewinder'],
                          columns=['max_speed', 'shield'])
        df
In [2]: df.pop('shield')  # Returns a Series.
In [3]: pd.DataFrame(df.pop('shield'))  # Returns a DataFrame.

Then swap the last two cells, so that they run in this order:

In[1]
In[3]
In[2]

Why does whichever cell runs third always raise an error?

I often run into this kind of error. Is it a cache issue? Redundancy? What causes it?


Solution

  • I think your code raises the expected error:

    df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
            index=['cobra', 'viper', 'sidewinder'],
            columns=['max_speed', 'shield'])
    

    Here DataFrame.pop takes the column shield out of the original DataFrame, returns it as a Series, and drops it from the original:

    a = df.pop('shield') # Return as series.
    print (a)
    cobra         2
    viper         5
    sidewinder    8
    Name: shield, dtype: int64
    

    So the shield column is no longer in df after pop:

    print (df)
                max_speed
    cobra               1
    viper               4
    sidewinder          7
    

    So a second pop fails to get the column shield, because it no longer exists in df:

    b = pd.DataFrame(df.pop('shield')) # Raises KeyError here: shield was already popped.
    print (b)
    KeyError: 'shield'
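
The behaviour above can be reproduced in a single script. The following is a minimal sketch, assuming that the goal is to get the shield column as a Series or as a DataFrame without depending on cell order. The helper `make_df` and the variable names are illustrative only and are not part of the original question. Plain selection with `df['shield']` or `df[['shield']]` does not modify df, so it can be repeated, whereas `pop` succeeds only once per column.

```python
import pandas as pd


def make_df():
    # Hypothetical helper: rebuilds the same data as in the question.
    return pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                        index=['cobra', 'viper', 'sidewinder'],
                        columns=['max_speed', 'shield'])


df = make_df()

# Non-mutating selection: df keeps the column, so these lines can be
# run in any order and any number of times.
as_series = df['shield']              # Series
as_frame = df[['shield']]             # one-column DataFrame
also_frame = df['shield'].to_frame()  # Series converted to a DataFrame

# pop mutates df in place: the first call succeeds and removes the column,
# so the second call raises KeyError.
popped = df.pop('shield')
try:
    df.pop('shield')
    second_pop_failed = False
except KeyError:
    second_pop_failed = True

print(list(df.columns))
print(second_pop_failed)
```

If the column really has to be removed, one option is to call `pop` once, keep the result in a variable, and build the DataFrame from that variable with `to_frame()`. Re-running the cell that creates df restores the column before popping again.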