my dataframe is as follows and I would like to rearrange, update and save it to csv or similar format.
time lat lon ws wd
1/1/2023 0:00 -5 107 8.214895 313.9049
1/1/2023 0:00 -5 107.25 8.351197 314.44873
1/1/2023 0:00 -5.25 107 7.6247864 307.61456
1/1/2023 1:00 -5 107 8.084728 310.0435
1/1/2023 1:00 -5 107.25 8.091071 308.73547
1/1/2023 1:00 -5.25 107 7.4219675 302.27475
1/1/2023 2:00 -5 107.25 7.8656287 304.08533
1/1/2023 2:00 -5 107.5 8.087259 300.91684
1/1/2023 2:00 -5.25 107 7.4953856 300.94644
Desired df:
time ws wd ws wd ws wd
0 8.214895 313.9049 8.351197 314.44873 7.6247864 307.61456
3600 8.084728 310.0435 8.091071 308.73547 7.4219675 302.27475
7200 7.8656287 304.08533 8.087259 300.91684 7.4953856 300.94644
Group all ws wd in one row for each time step. One time step is 1h (3600s), 24 rows for one day, and the next day will be 25h, 26h (90000s, 93600s and so on).
I have tried stack, unstack, pivot, but may be missing something. I would appreciate your sugesstion. Thank you.
Convert column time
to datetimes and subtract minimal value for timedeltas, for seconds use Series.dt.total_seconds
, then create counter by GroupBy.cumcount
and pivot by DataFrame.pivot
, last sorting output with flatten MultiIndex
for prevent duplicated columns names:
times = pd.to_datetime(df['time'])
df['time'] = times.sub(times.min()).dt.total_seconds().astype(int)
out = (df.assign(g = df.groupby('time').cumcount())
.pivot(index='time', values=['ws','wd'], columns='g')
.sort_index(level=[1, 0], axis=1, ascending=[True, False]))
out.columns = out.columns.map(lambda x: f'{x[0]}_{x[1]}')
print (out)
ws_0 wd_0 ws_1 wd_1 ws_2 wd_2
time
0 8.214895 313.90490 8.351197 314.44873 7.624786 307.61456
3600 8.084728 310.04350 8.091071 308.73547 7.421968 302.27475
7200 7.865629 304.08533 8.087259 300.91684 7.495386 300.94644