python pandas datetime dataframe keyerror

KeyError thrown when trying to access named index


I'm trying to aggregate my min & max temperatures of 2 different states across a year. The columns in my dataframe are Date, Name, Tmax, Tmin.

However, when I try to use:

df['Year'], df['Month-Date'] = zip(*df['Date'].apply(lambda x: (x[:4], x[5:])))

it raises a KeyError.

Running df.dtypes returns:

NAME     object
TMAX    float64
TMIN    float64
dtype: object

So although my dataframe clearly shows a Date column, it's not in my list of columns. When I set my index to Date prior to this, there were no errors. Any ideas on what I'm doing wrong?


Solution

  • It seems you've set Date to be your index, so, naturally, it doesn't show up as one of the columns. You'd refer to it using df.index now.

    Furthermore, I don't recommend string operations on datetime data. A DatetimeIndex exposes the date components directly as attributes (.year, .month, .day); for a datetime column, use the .dt accessor instead. If the data isn't in datetime format already, convert it with pd.to_datetime first.

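    # safe to run even if the index is already a DatetimeIndex (the conversion is then a no-op)
    y = pd.to_datetime(df.index, errors='coerce')
    # note: y.month is only the month number; y.strftime('%m-%d') would give a month-day string
    df['Year'], df['Month-Date'] = y.year, y.month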
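
To make this concrete, here is a minimal sketch that reproduces the KeyError and shows two ways around it. The sample data is made up, and I'm assuming the dates are 'YYYY-MM-DD' strings (which is what the x[:4], x[5:] slicing in the question suggests); adjust the format string if yours differ.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's dataframe
df = pd.DataFrame({
    "Date": ["2015-01-01", "2015-01-02", "2015-02-01"],
    "NAME": ["A", "B", "A"],
    "TMAX": [10.0, 12.5, 8.0],
    "TMIN": [1.0, 3.5, -2.0],
}).set_index("Date")

# "Date" is now the index, so looking it up as a column fails
try:
    df["Date"]
    raised = False
except KeyError:
    raised = True

# Option 1: read the date parts straight off the converted index
idx = pd.to_datetime(df.index, errors="coerce")
df["Year"] = idx.year
df["Month-Date"] = idx.strftime("%m-%d")  # assumed month-day format

# Option 2: move the index back into a column and use the .dt accessor
df2 = df.reset_index()
df2["Date"] = pd.to_datetime(df2["Date"], errors="coerce")
df2["Month"] = df2["Date"].dt.month
```

Option 1 leaves Date as the index, which is handy if you later want to slice by date. Option 2 is the one to pick if you'd rather keep referring to df['Date'] as an ordinary column.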