Tags: pandas, time-series, rolling-computation, peak-detection

python/pandas time series: fast attack/slow decay; peak detection with decay


I would like to implement a "fast attack / slow decay" filter (peak detection with exponential decay) on a time series ts (a column in a pandas dataframe), described as follows:

fasd[t] = max(ts[t], 0.9 * fasd[t-1])

The "basic" code below works, but is there a more Pythonic and efficient way to do it, using rolling() or vectorized methods? Thanks.

import pandas as pd

ts = [1,0,0,0,0,1,0,0,0,1,0.95,1,1,1,1,1,0,0,1,1,1,1,1,1]
df = pd.DataFrame({'ts': ts})
df['fasd'] = 0.0                      # float column avoids dtype upcasting issues
df.loc[0, 'fasd'] = df.loc[0, 'ts']   # the filter starts at the first sample
for i in range(1, len(df)):
    # fasd[t] = max(ts[t], 0.9 * fasd[t-1])
    df.loc[i, 'fasd'] = max(df.loc[i, 'ts'], 0.9 * df.loc[i-1, 'fasd'])

Solution

  • Using NumPy directly is more efficient:

    from time import time
    import pandas as pd

    ts = [1,0,0,0,0,1,0,0,0,1,0.95,1,1,1,1,1,0,0,1,1,1,1,1,1] * 1000  # artificially increasing the input size
    df = pd.DataFrame({'ts': ts})
    df['fasd'] = 0.0                       # float column avoids dtype upcasting issues
    df.loc[0, 'fasd'] = df.loc[0, 'ts']
    df2 = df.copy()

    # Row-by-row loop over the DataFrame
    t0 = time()
    for i in range(1, len(df)):
        df.loc[i, 'fasd'] = max(df.loc[i, 'ts'], 0.9 * df.loc[i-1, 'fasd'])
    t1 = time()
    print(f'Pandas version executed in {t1-t0} sec.')

    def fasd(array):
        # array[:, 0] holds ts, array[:, 1] holds the filtered output
        for i in range(1, len(array)):
            array[i, 1] = max(array[i, 0], 0.9 * array[i-1, 1])
        return array

    # Same recurrence, but looping over a plain NumPy array
    t0 = time()
    df2 = pd.DataFrame(fasd(df2.to_numpy()), columns=df2.columns)  # keep the column names
    t1 = time()
    print(f'Numpy version executed in {t1-t0} sec.')
    

    Output:

    Pandas version executed in 3.0636708736419678 sec.
    Numpy version executed in 0.011569976806640625 sec.
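
  • If you want a fully vectorized version with no Python loop at all, the recurrence can be unrolled to fasd[t] = max over k ≤ t of 0.9**(t-k) * ts[k], which becomes a cumulative maximum after reweighting. A minimal sketch of that idea (not part of the timing comparison above, and only valid while the 0.9**t weights stay within floating-point range):

    import numpy as np
    import pandas as pd

    ts = [1,0,0,0,0,1,0,0,0,1,0.95,1,1,1,1,1,0,0,1,1,1,1,1,1]
    s = pd.Series(ts)

    decay = 0.9
    w = decay ** np.arange(len(s))                      # w[t] = 0.9**t
    # fasd[t] = max_{k<=t} 0.9**(t-k) * ts[k] = w[t] * cummax(ts[k] / w[k])
    fasd = w * np.maximum.accumulate(s.to_numpy() / w)

    Because w shrinks geometrically, ts / w grows geometrically, so for very long series the division overflows and w eventually underflows to zero. In that case, either process the series in chunks and carry the last fasd value into the next chunk, or keep the simple loop and compile it with Numba.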