I have two solutions to this problem actually, they are both applied below to a test case. The thing is that none of them is perfect: first one only take into account the two end points, the other one can't be made "arbitrarily smooth": there is a limit in the amount of smoothness one can achieve (the one I am showing). I am sure there is a better solution, that kind-of go from the first solution to the other and all the way to no smoothing at all. It may already be implemented somewhere. Maybe solving a minimization problem with an arbitrary number of splines equidistributed?
Thank you very much for your help
Ps: the seed used is a challenging one
import matplotlib.pyplot as plt
from scipy import interpolate
from scipy.signal import savgol_filter
import numpy as np
import random
def scipy_bspline(cv, n=100, degree=3):
""" Calculate n samples on a bspline
cv : Array ov control vertices
n : Number of samples to return
degree: Curve degree
"""
cv = np.asarray(cv)
count = cv.shape[0]
degree = np.clip(degree,1,count-1)
kv = np.clip(np.arange(count+degree+1)-degree,0,count-degree)
# Return samples
max_param = count - (degree * (1-periodic))
spl = interpolate.BSpline(kv, cv, degree)
return spl(np.linspace(0,max_param,n))
def round_up_to_odd(f):
return np.int(np.ceil(f / 2.) * 2 + 1)
def generateRandomSignal(n=1000, seed=None):
"""
Parameters
----------
n : integer, optional
Number of points in the signal. The default is 1000.
Returns
-------
sig : numpy array
"""
np.random.seed(seed)
print("Seed was:", seed)
steps = np.random.choice(a=[-1, 0, 1], size=(n-1))
roughSig = np.concatenate([np.array([0]), steps]).cumsum(0)
sig = savgol_filter(roughSig, round_up_to_odd(n/10), 6)
return sig
# Generate a random signal to illustrate my point
n = 1000
t = np.linspace(0, 10, n)
seed = 45136. # Challenging seed
sig = generateRandomSignal(n=1000, seed=seed)
sigInit = np.copy(sig)
# Add noise to the signal
mean = 0
std = sig.max()/3.0
num_samples = n/5
idxMin = n/2-100
idxMax = idxMin + num_samples
tCut = t[idxMin+1:idxMax]
noise = np.random.normal(mean, std, size=num_samples-1) + 2*std*np.sin(2.0*np.pi*tCut/0.4)
sig[idxMin+1:idxMax] += noise
# Define filtering range enclosing the noisy area of the signal
idxMin -= 20
idxMax += 20
# Extreme filtering solution
# Spline between first and last points, the points in between have no influence
sigTrim = np.delete(sig, np.arange(idxMin,idxMax))
tTrim = np.delete(t, np.arange(idxMin,idxMax))
f = interpolate.interp1d(tTrim, sigTrim, kind='quadratic')
sigSmooth1 = f(t)
# My attempt. Not bad but not perfect because there is a limit in the maximum
# amount of smoothing we can add (degree=len(tSlice) is the maximum)
# If I could do degree=10*len(tSlice) and converging to the first solution
# I would be done!
sigSlice = sig[idxMin:idxMax]
tSlice = t[idxMin:idxMax]
cv = np.stack((tSlice, sigSlice)).T
p = scipy_bspline(cv, n=len(tSlice), degree=len(tSlice))
tSlice = p.T[0]
sigSliceSmooth = p.T[1]
sigSmooth2 = np.copy(sig)
sigSmooth2[idxMin:idxMax] = sigSliceSmooth
# Plot
plt.figure()
plt.plot(t, sig, label="Signal")
plt.plot(t, sigSmooth1, label="Solution 1")
plt.plot(t, sigSmooth2, label="Solution 2")
plt.plot(t[idxMin:idxMax], sigInit[idxMin:idxMax], label="What I'd want (kind of, smoother will be even better actually)")
plt.plot([t[idxMin],t[idxMax]], [sig[idxMin],sig[idxMax]],"o")
plt.legend()
plt.show()
sys.exit()
Yes, a minimization is a good way to approach this smoothing problem.
Here is a suggestion for a least squares formulation: let s[0], ..., s[N] denote the N+1 samples of the given signal to smooth, and let L and R be the desired slopes to preserve at the left and right endpoints. Find the smoothed signal u[0], ..., u[N] as the minimizer of
min_u (1/2) sum_n (u[n] - s[n])² + (λ/2) sum_n (u[n+1] - 2 u[n] + u[n-1])²
subject to
s[0] = u[0], s[N] = u[N] (value constraints),
L = u[1] - u[0], R = u[N] - u[N-1] (slope constraints),
where in the minimization objective, the sums are over n = 1, ..., N-1 and λ is a positive parameter controlling the smoothing strength. The first term tries to keep the solution close to the original signal, and the second term penalizes u for bending to encourage a smooth solution.
The slope constraints require that u[1] = L + u[0] = L + s[0] and u[N-1] = u[N] - R = s[N] - R. So we can consider the minimization as over only the interior samples u[2], ..., u[N-2].
The minimizer satisfies the Euler–Lagrange equations
(u[n] - s[n]) / λ + (u[n+2] - 4 u[n+1] + 6 u[n] - 4 u[n-1] + u[n-2]) = 0
for n = 2, ..., N-2.
An easy way to find an approximate solution is by gradient descent: initialize u = np.copy(s)
, set u[1] = L + s[0] and u[N-1] = s[N] - R, and do 100 iterations or so of
u[2:-2] -= (0.05 / λ) * (u - s)[2:-2] + np.convolve(u, [1, -4, 6, -4, 1])[4:-4]
But with some more work, it is possible to do better than this by solving the E–L equations directly. For each n, move the known quantities to the right-hand side: s[n] and also the endpoints u[0] = s[0], u[1] = L + s[0], u[N-1] = s[N] - R, u[N] = s[N]. The you will have a linear system "A u = b", and matrix A has rows like
0, ..., 0, 1, -4, (6 + 1/λ), -4, 1, 0, ..., 0.
Finally, solve the linear system to find the smoothed signal u. You could use numpy.linalg.solve
to do this if N is not too large, or if N is large, try an iterative method like conjugate gradients.