I want to use a median filter to smooth a signal. I see there are two methods in Python that can be used:
medfilt from scipy.signal
DataFrame.rolling().median() from pandas
When I select the same window size for both methods, I get different results. I have attached an example data set. Furthermore, in the second method the number of data points changes when the filter is applied (depending on the window size), which I expect; however, in the first method the number of smoothed data points is the same as in the original data.
What is the difference between these two methods and why are different results obtained?
import pandas as pd
import scipy.signal as ss
signal = [4, 3.8, 3.75, 3.9, 3.53, 3.26, 2.33, 2.8, 2.5, 2.4, 2, 2.2, 1.5, 1.7]
# First method
SmoothedSignal = ss.medfilt(signal, kernel_size=5)
print(SmoothedSignal)
print(len(SmoothedSignal))
# Second method
signal = pd.DataFrame(signal)
RollingMedian = signal.rolling(5).median()
print(RollingMedian)
print(len(RollingMedian))
The differing median values are caused by the alignment of the kernel. pandas.DataFrame.rolling right-aligns the kernel by default, while scipy.signal.medfilt centers its kernel on each sample.
You can center-align DataFrame.rolling by setting the center keyword argument to True.
import pandas as pd
import scipy.signal as ss

signal = [4, 3.8, 3.75, 3.9, 3.53, 3.26, 2.33, 2.8, 2.5, 2.4, 2, 2.2, 1.5, 1.7]

# scipy - kernel centered on each sample
smoothed = ss.medfilt(signal, kernel_size=5)

# rolling - right aligned (the pandas default)
signal = pd.Series(signal)
rolling_right = signal.rolling(5).median()

# rolling - center aligned
rolling_center = signal.rolling(5, center=True).median()

# collect the three results side by side for comparison
df = pd.DataFrame({
    'smooth': smoothed,
    'rolling_center': rolling_center,
    'rolling_right': rolling_right,
})
print(df)
# output
    smooth  rolling_center  rolling_right
0     3.75             NaN            NaN
1     3.80             NaN            NaN
2     3.80            3.80            NaN
3     3.75            3.75            NaN
4     3.53            3.53           3.80
5     3.26            3.26           3.75
6     2.80            2.80           3.53
7     2.50            2.50           3.26
8     2.40            2.40           2.80
9     2.40            2.40           2.50
10    2.20            2.20           2.40
11    2.00            2.00           2.40
12    1.70             NaN           2.20
13    1.50             NaN           2.00
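The alignment difference can be checked directly: for a window of 5, the right-aligned result should be the centered result moved two positions later. A minimal sketch, reusing the signal from the question (the names offset and shifted are only illustrative):

```python
import pandas as pd

signal = pd.Series([4, 3.8, 3.75, 3.9, 3.53, 3.26, 2.33, 2.8, 2.5, 2.4, 2, 2.2, 1.5, 1.7])
window = 5

rolling_right = signal.rolling(window).median()
rolling_center = signal.rolling(window, center=True).median()

# A centered window of odd size w reaches (w - 1) // 2 samples ahead,
# so moving the right-aligned result earlier by that amount lines the two up.
offset = (window - 1) // 2
shifted = rolling_right.shift(-offset)

# Series.equals treats NaNs in the same positions as equal.
print(shifted.equals(rolling_center))
```

If the two results really differ only by alignment, the comparison prints True.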
You'll also notice the difference in NaN filling when using rolling: positions without a full window are returned as NaN.
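The edge values differ because scipy.signal.medfilt zero-pads the input before filtering, while rolling leaves incomplete windows as NaN. As a rough sketch (not the only way to handle edges), the medfilt output can be reproduced with pandas by padding the signal with zeros manually; padded, half and emulated are illustrative names:

```python
import numpy as np
import pandas as pd
import scipy.signal as ss

signal = [4, 3.8, 3.75, 3.9, 3.53, 3.26, 2.33, 2.8, 2.5, 2.4, 2, 2.2, 1.5, 1.7]
window = 5
half = window // 2

# np.pad defaults to constant padding with zeros, mimicking medfilt's edges.
padded = pd.Series(np.pad(signal, half))

# Centered rolling median on the padded signal, then drop the padded positions.
emulated = padded.rolling(window, center=True).median().to_numpy()[half:-half]

print(np.allclose(emulated, ss.medfilt(signal, kernel_size=window)))
```

If zero-padding is not wanted, passing min_periods=1 to rolling is another option: it computes the median from whatever values fall inside the window at the edges, so the edge values will not match medfilt.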