python pandas dataframe time-series runtime

Loop through time series data and collect a specific time window while keeping the runtime O(N)


I am trying to loop through a time series DataFrame and, for a specific time, look back 5 minutes and 10 minutes to check whether a condition is met. I also need to make sure I do NOT overcount the data, because of multicollinearity. Below is the code that I wrote. I would love for it to run in O(N) without needing two loops. I was thinking of saving the index somehow to save space, but I need help here.

Thanks in advance

Sorry this is not a great question


Solution

  • Does this do what you want:

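    import pandas as pd

    # Time-based rolling windows need a datetime-like index.
    fillData.set_index('time', drop=True, inplace=True)
    condition = fillData['fill'].eq(1).astype(int)
    # 1 if any row in the trailing window (current row included) has fill == 1.
    fillData['500 milli'] = (condition.rolling(pd.Timedelta('500ms'))
                                      .max()
                                      .astype(int))
    fillData['6 minutes'] = (condition.rolling(pd.Timedelta('6min'))
                                      .max()
                                      .astype(int))
    # Avoid double counting: clear the 6-minute flag where the 500 ms flag is set.
    # Use .loc rather than chained indexing so the assignment hits fillData itself.
    fillData.loc[fillData['500 milli'].eq(1), '6 minutes'] = 0
    fillData.reset_index(drop=False, inplace=True)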
    

    I'm not sure how fillData is sorted. The code assumes it is sorted in ascending time order. If it is not, sort it by time (or reverse it) first.
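
    To try the idea on a small frame, here is a minimal sketch. The column names 'time' and 'fill' follow the snippet above, but the sample timestamps and values are made up for illustration. The explicit sort_values call is an added assumption, for input that may not arrive in time order.

```python
import pandas as pd

# Hypothetical sample data: only the first row meets the condition (fill == 1).
fillData = pd.DataFrame({
    'time': pd.to_datetime([
        '2021-01-01 09:00:00.000',
        '2021-01-01 09:00:00.300',
        '2021-01-01 09:00:01.000',
        '2021-01-01 09:05:00.000',
        '2021-01-01 09:07:00.000',
    ]),
    'fill': [1, 0, 0, 0, 0],
})

# Sort ascending by time, then index by time so offset-based windows work.
fillData = fillData.sort_values('time').set_index('time')

condition = fillData['fill'].eq(1).astype(int)

# Each window looks back from the current row (inclusive) over the given span.
fillData['500 milli'] = condition.rolling('500ms').max().astype(int)
fillData['6 minutes'] = condition.rolling('6min').max().astype(int)

# Rows already flagged by the short window are not counted again in the long one.
fillData.loc[fillData['500 milli'].eq(1), '6 minutes'] = 0

fillData = fillData.reset_index()
print(fillData)
```

    With this sample, the first two rows are flagged in '500 milli', the next two only in '6 minutes', and the last row in neither, because the only fill is more than 6 minutes behind it. The built-in rolling reductions run in compiled code, so this should avoid the explicit nested loops from the question.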