I'm trying to aggregate my min & max temperatures of 2 different states across a year. The columns in my dataframe are Date, Name, Tmax, Tmin.
However, when I try to use:
df['Year'], df['Month-Date'] = zip(*df['Date'].apply(lambda x: (x[:4], x[5:])))
it raises a KeyError.
Running df.dtypes returns:
NAME object
TMAX float64
TMIN float64
dtype: object
So although my dataframe clearly shows a Date column, it's not in my list of columns. When I set my index to Date prior to this, there were no errors. Any ideas on what I'm doing wrong?
It seems you've set Date to be your index, so, naturally, it doesn't show up as one of the columns. You'd refer to it using df.index now.
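As a minimal sketch of why the lookup fails, here is a small frame with the same column names as in the question. The rows are invented for illustration, and the name restored is just a placeholder:

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame(
    {
        "Date": ["2015-01-01", "2015-01-02"],
        "NAME": ["Texas", "Ohio"],
        "TMAX": [15.0, 2.5],
        "TMIN": [3.0, -4.0],
    }
).set_index("Date")

# After set_index, Date lives in the index and is no longer a column,
# so df["Date"] raises a KeyError.
print("Date" in df.columns)  # False
print(df.index.name)         # Date

# Option 1: read the values straight from the index.
dates = df.index

# Option 2: move the index back into a regular column.
restored = df.reset_index()
print("Date" in restored.columns)  # True
```

Either option works; reset_index is the simpler choice if the rest of your code expects Date to be a column.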
Furthermore, I don't recommend string operations on datetime data. Use the datetime accessors to extract the date components that you want (attributes such as .year on a DatetimeIndex, or .dt.year on a Series). If it isn't in datetime format already, convert it with pd.to_datetime.
import pandas as pd

# harmless if the index is already a DatetimeIndex; unparseable values become NaT
y = pd.to_datetime(df.index, errors='coerce')
df['Year'], df['Month-Date'] = y.year, y.strftime('%m-%d')
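To tie this back to the original goal of aggregating min and max temperatures per state, here is a hedged end-to-end sketch. The sample rows are invented, the names idx and summary are placeholders, and grouping by NAME and Year is an assumption about what "across a year" means in the question:

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df = pd.DataFrame(
    {
        "Date": ["2015-01-01", "2015-01-01", "2015-01-02", "2015-01-02"],
        "NAME": ["Texas", "Ohio", "Texas", "Ohio"],
        "TMAX": [15.0, 2.5, 17.0, 1.0],
        "TMIN": [3.0, -4.0, 5.0, -6.5],
    }
).set_index("Date")

# Convert the string index to datetimes, then pull out the components.
idx = pd.to_datetime(df.index, errors="coerce")
df["Year"] = idx.year
df["Month-Date"] = idx.strftime("%m-%d")  # e.g. "01-02", like x[5:] in the question

# Assumed aggregation: highest TMAX and lowest TMIN per state and year.
summary = df.groupby(["NAME", "Year"]).agg(
    TMAX=("TMAX", "max"),
    TMIN=("TMIN", "min"),
)
print(summary)
```

If you want the extremes for each calendar day across years instead, group by ["NAME", "Month-Date"]. If Date is a regular column rather than the index, the same components are available through the .dt accessor, for example df["Date"].dt.year after converting the column with pd.to_datetime.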