I have a dataframe that looks like this:
   datetime             cnt
1  2015-01-04 00:00:00  102
2  2015-01-04 01:00:00   23
3  2015-01-04 02:00:00   45
Turns into:
   datetime    cnt
1  2015-01-04  170
Here datetime is the timestamp and cnt is the number of bicycles rented on that day during that hour. I want to collapse these 3 rows into 1 row that has the date without the hour and the summed cnt.
Only rows that fall on the same day, but at different hours, should be merged.
import pandas as pd

path2 = r'C:\Users\bossd\OneDrive\Документы\zhopa123.csv'
df2 = pd.read_csv(path2)
col2015 = df2.loc[df2['year'] == 2015]
a = col2015['cnt']
b = col2015['timestamp']
I tried this loop to merge two adjacent rows and sum their cnt values, but it didn't work:
for k in len(df2+1):
    if (df2.loc[k, 'timestamp']).date() == (df2.loc[k+1, 'timestamp']).date():
        df2.loc[df2.index[k], 'cnt'] + df2.loc[df2.index[k+1], 'cnt']
        df2.drop(df2.index[k+1])
    elif (df2.loc[k, 'timestamp']).date() != (df2.loc[k+1, 'timestamp']).date():
        pass
I got this error: TypeError: can only concatenate str (not "int") to str
IIUC, you can use .groupby:
df = df.groupby(df["datetime"].dt.date)["cnt"].sum().reset_index()
print(df)
Prints:
datetime cnt
0 2015-01-04 170
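For a self-contained version of the above, here is a minimal sketch built from the sample rows in the question. It assumes the datetime column arrives as plain strings (which is what pd.read_csv produces unless dates are parsed explicitly), so it converts the column with pd.to_datetime before using .dt. The fourth row is a hypothetical addition, included only to show that different days stay separate.

```python
import pandas as pd

# Sample rows from the question, plus one hypothetical row on another day
df = pd.DataFrame(
    {
        "datetime": [
            "2015-01-04 00:00:00",
            "2015-01-04 01:00:00",
            "2015-01-04 02:00:00",
            "2015-01-05 00:00:00",  # hypothetical extra row, not in the question
        ],
        "cnt": [102, 23, 45, 10],
    }
)

# Assumption: the column holds strings, so convert it to datetime64
# before using the .dt accessor
df["datetime"] = pd.to_datetime(df["datetime"])

# Group by calendar date and sum the hourly counts
daily = df.groupby(df["datetime"].dt.date)["cnt"].sum().reset_index()
print(daily)
```

If the column in your file is called timestamp rather than datetime, as in the question's code, substitute that name.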
EDIT:
.dt.date returns the date portion from a datetime Series.
Series.reset_index() creates a dataframe from the Series (the index was the date; now it's a column and the dataframe has a new index starting from 0).
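To see what each step produces, the one-liner can be split into its intermediate objects. This is only an illustrative sketch using the three rows from the question; the variable names dates, summed and result are made up for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            ["2015-01-04 00:00:00", "2015-01-04 01:00:00", "2015-01-04 02:00:00"]
        ),
        "cnt": [102, 23, 45],
    }
)

# Step 1: a Series of datetime.date objects, one per row
dates = df["datetime"].dt.date

# Step 2: a Series of summed counts whose index holds the dates
summed = df.groupby(dates)["cnt"].sum()

# Step 3: the date index becomes a regular column and the
# dataframe gets a fresh integer index starting from 0
result = summed.reset_index()

print(summed)
print(result)
```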