Getting KeyError when setting dataset column to variable with pandas


I am attempting to set a dataset column (imported from csv) to a new variable in order to plot data in a histogram:

x = gcbs[gcbs['E1']]   #dataset is gcbs, column is 'E1'

plt.hist(x, bins = 9)

I receive the following error:

KeyError: "None of [Int64Index([...,\n dtype='int64', length=2495)] are in the [columns]"

Why am I getting this error?


Solution

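  • gcbs[gcbs['E1']] passes the values of column 'E1' as if they were column labels, so pandas looks for columns named after those integers and raises a KeyError when it finds none. If you want to filter rows with a boolean mask, use loc; if you just want a column, select it with [], e.g.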

    import pandas as pd

    df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [10, 20, 30, 40]})
    x = df["x"]  # get the column "x"

    df_gt_2 = df.loc[df["x"] > 2]  # get the rows where "x" is greater than 2
    

    So, if I read your question correctly, you only need the first part (plain column selection):

    import matplotlib.pyplot as plt

    x = gcbs["E1"]  # select the column by name; gcbs is your DataFrame
    plt.hist(x, bins=9)
    plt.show()
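
To see the difference between the two kinds of indexing side by side, here is a minimal sketch. The data is made up for illustration (only the column name E1 comes from the question, and E2 is a hypothetical second column), so treat it as a demonstration of the behaviour rather than a reproduction of the original dataset.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data; only the name "E1" is from the question.
gcbs = pd.DataFrame({"E1": [1, 5, 3, 4, 2], "E2": [2, 2, 5, 1, 3]})

# A non-boolean Series inside [] is treated as a list of column labels,
# so pandas looks for columns named 1, 5, 3, ... and finds none.
try:
    gcbs[gcbs["E1"]]
    raised = False
except KeyError:
    raised = True

# Plain column selection returns the Series you would pass to plt.hist.
x = gcbs["E1"]

# A boolean mask, by contrast, selects rows.
high = gcbs.loc[gcbs["E1"] > 3]
```

Here raised ends up True, x holds the five values of E1, and high keeps only the rows where E1 is greater than 3.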