python pandas keyerror

Cannot find key in imported dataframe despite the key being present


I have a panel dataset about the Internet Penetration Rate of 11 countries and I want to encode the Country values.

     Country  Year  Int Pen Rate %  GDP Per Capita  GDP Growth %
0  Australia  2014       84.000000     76864.54828      2.579017
1  Australia  2015       84.560515     77397.27020      2.152736
2  Australia  2016       86.540000     78278.37956      2.730548
3  Australia  2017       86.545049     78751.93511      2.282184
4  Australia  2018       90.000000     79813.73387      2.883045

Why do I get a KeyError on 'Country' when I run the following line?

series['Country'] = label_encoder.fit_transform(series['Country'])

I used sep=',' when I read the dataset.


Solution

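  • The KeyError means that pandas cannot find a column named exactly 'Country' in your DataFrame. Either the column is missing altogether, or its name differs slightly from what you typed, for example because of leading or trailing whitespace or different capitalization.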

    Please ensure that the DataFrame you are trying to operate on (which you're calling 'series') actually has a column named 'Country'. You can check the column names of your DataFrame with the following command:

    print(series.columns)
    

    If 'Country' is not in the output of that command, then you need to determine the correct column name and use that in your code.

    Without more information about the format of your CSV file or DataFrame, it's difficult to give a more precise answer.
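One common cause that matches the symptom "the key looks present but is not found" is stray whitespace in the header row. The following is a minimal sketch of how to check for and repair that, not a diagnosis of your actual file: the CSV text is hypothetical sample data with a leading space deliberately placed before Country, and the variable name series is taken from your question.

```python
import io

import pandas as pd

# Hypothetical CSV text (an assumption, not the asker's file):
# the first header has a stray leading space.
csv_text = (
    " Country,Year,Int Pen Rate %\n"
    "Australia,2014,84.0\n"
    "Australia,2015,84.560515\n"
)

series = pd.read_csv(io.StringIO(csv_text), sep=",")

# Printing the names as a list shows them in quotes,
# which makes stray whitespace visible.
print(series.columns.tolist())

# Record whether the exact name exists before any cleanup.
had_country_before = "Country" in series.columns

# Strip surrounding whitespace from every column name.
series.columns = series.columns.str.strip()

# After cleanup the exact name can be looked up.
print("Country" in series.columns)
print(series["Country"].tolist())
```

If the printed names show no extra whitespace, the cause is something else, such as different capitalization or a different separator in the file, and the output of print(series.columns) should point to it.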