I have a csv file like this:
2021-01-05 10:57:12.762000, REDDE EHZ AM 00, trigger
2021-01-05 10:58:26.622000, REDDE EHZ AM 00, trigger
2021-01-05 11:02:16.772000, REDDE EHZ AM 00, trigger
2021-01-05 11:02:34.042000, REDDE EHZ AM 00, trigger
2021-01-05 17:12:07.221999, REDDE EHZ AM 00, trigger
2021-01-06 01:42:45.501999, REDDE EHZ AM 00, trigger
2021-01-06 01:44:24.481999, REDDE EHZ AM 00, trigger
2021-01-06 01:44:58.051999, REDDE EHZ AM 00, trigger
2021-01-06 01:45:14.871999, REDDE EHZ AM 00, trigger
2021-01-06 01:47:10.901999, REDDE EHZ AM 00, trigger
2021-01-06 07:57:33.221999, REDDE EHZ AM 00, trigger
2021-01-06 07:57:48.821999, REDDE EHZ AM 00, trigger
2021-01-06 07:58:51.031999, REDDE EHZ AM 00, trigger
2021-01-06 07:59:27.001999, REDDE EHZ AM 00, trigger
2021-01-06 08:00:56.871999, REDDE EHZ AM 00, trigger
2021-01-06 11:28:17.191999, REDDE EHZ AM 00, trigger
2021-01-06 11:28:46.201999, REDDE EHZ AM 00, trigger
2021-01-06 11:29:19.111999, REDDE EHZ AM 00, trigger
2021-01-06 11:29:41.891999, REDDE EHZ AM 00, trigger
2021-01-06 11:30:51.901999, REDDE EHZ AM 00, trigger
2021-01-06 11:31:21.921999, REDDE EHZ AM 00, trigger
2021-01-06 11:32:23.001999, REDDE EHZ AM 00, trigger
2021-01-06 11:32:58.271999, REDDE EHZ AM 00, trigger
2021-01-07 11:33:46.891999, REDDE EHZ AM 00, trigger
2021-01-07 12:38:50.021999, REDDE EHZ AM 00, trigger
2021-01-07 12:39:53.881999, REDDE EHZ AM 00, trigger
2021-01-08 12:42:07.371999, REDDE EHZ AM 00, trigger
2021-01-08 12:42:46.441999, REDDE EHZ AM 00, trigger
2021-01-09 12:44:14.291999, REDDE EHZ AM 00, trigger
I added the header with:
df = pd.read_csv(r'D:\Inves\SM\CC_Cbba\REDPy\OSCREDDE_3_\redde_3_trigs.dat',
sep=',', header=None, usecols=[0, 1, 2])
headers = ["TrigDT", "Sta", "Type"]
The output is:
TrigDT Sta Type
0 2021-01-05 10:57:12.762000 REDDE EHZ AM 00 trigger
1 2021-01-05 10:58:26.622000 REDDE EHZ AM 00 trigger
2 2021-01-05 11:02:16.772000 REDDE EHZ AM 00 trigger
3 2021-01-05 11:02:34.042000 REDDE EHZ AM 00 trigger
4 2021-01-05 17:12:07.221999 REDDE EHZ AM 00 trigger
...
I created a Date colum in roder to try to group the information by days:
df['TrigDT'] = pd.to_datetime(df['TrigDT'])
df['Date'] = df['TrigDT'].dt.date
I tried to have a cumulative sum with Index because I do not have a colum with counter of events, then I tried to group by days but I fail:
df = df.groupby('Date').index.sum()
df = df.groupby(df.index.day).cumsum().reset_index()
The idea is to create a cumulative plot with the DataFrame information (X axis with dates a Y axis with cumulative information), I tried to have a plot like https://stackoverflow.com/questions/53895480/python-plot-timedelta-and-cumulative-values
Would you mind to give me some tips to reach the objective?, the expected output could be like this, in my case only with one station that is called REDDE:
Not sure I understood what a
, b
, c
would be exactly in your chart, but here is how you can accumulate the number of triggers daily:
# Load the data and set 'TrigDT' as the index
df = pd.read_csv('path_to_file.csv', header=None, usecols=(0, 1, 2), names=('TrigDT', 'Sta', 'Type'))
df.TrigDT = pd.to_datetime(df.TrigDT)
df.set_index('TrigDT', inplace=True)
# Resample to daily and count
daily_acc = df.resample('d').count().cumsum()
daily_acc.plot()