Search code examples
pythonnumpyscipy

How to "smooth" a discrete/stepped signal in a vectorized way with numpy/scipy?


I have a signal like the orange one in the following plot that can only have integer values: enter image description here

As you can see, the orange signal in a bit noisy and "waffles" between levels sometimes when its about to change to a new steady state. I'd like to "smooth" this effect and achieve the blue signal. The blue signal is the orange one filtered such that the transitions don't occur until 3 samples in a row have made the jump to the next step. This is pretty easy if I loop through each sample manually and use a couple state variables to track how many times in a row I've jumped to a new step, but its also slow. I'd like to find a way to vectorize this in numpy. Any ideas?

Here's an example of the non-vectorized way that seems to do what I want:

up_count = 0
dn_count = 0
out = x.copy()
for i in range(len(out)-1):
    if out[i+1] > out[i]:
        up_count += 1
        dn_count = 0

        if up_count == 3:
            up_count = 0
            out[i+1] = out[i+1]
        else:
            out[i+1] = out[i]
    elif out[i+1] < out[i]:
        up_count = 0
        dn_count += 1

        if dn_count == 3:
            dn_count = 0
            out[i+1] = out[i+1]
        else:
            out[i+1] = out[i]
    else:
        dn_count = 0
        up_count = 0

EDIT:
Thanks to @Bogdan Shevchenko for this solution. I already have numpy and scipy available, so rather than get pandas involved here's my numpy/scipy version of his answer:

def ffill(arr, mask):
    idx = np.where(~mask, np.arange(mask.shape[0])[:, None], 0)
    np.maximum.accumulate(idx, axis=0, out=idx)
    return arr[idx, np.arange(idx.shape[1])]

x_max = scipy.ndimage.maximum_filter1d(x, 3, axis=0, origin=1, mode="nearest")
x_min = scipy.ndimage.minimum_filter1d(x, 3, axis=0, origin=1, mode="nearest")
x_smooth = ffill(x, x_max!=x_min)

Solution

  • There could be a lot of different strategies, based on DataFrame.rolling processing. In example:

    t = pd.Series([
        1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 
        1, 2, 1, 2, 1, 1, 2, 2, 3, 1, 2, 
        2, 1, 1, 2, 1, 1, 1, 1, 2, 2, 2, 
        2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
    min_at_row = 3
    
    t_smoothed = t.copy()
    t_smoothed[
        t.rolling(min_at_row, min_periods=1).max() != 
        t.rolling(min_at_row, min_periods=1).min()
    ] = None
    t_smoothed = t_smoothed.ffill()
    

    Result looks like what you want to obtain: enter image description here

    If you want to stay with numpy only, without pandas, use np.lib.stride_tricks.sliding_window_view(t, min_at_row).max(axis=1) instead of t.rolling(min_at_row).max() (and analogously with min), but be aware that it will return array without first (min_at_row - 1) values

    You can also use t_smoothed.shift(-min_at_row + 1) to remove delay of smoothened signal.