I need to find zero-crossings in a 1D array of a roughly periodic function. It will be the points where an orbiting satellite crosses the Earth's equator going north.
I've worked out a simple solution based on finding points where one value is zero or negative and the next is positive, then using a quadratic or cubic interpolator with scipy.optimize.brentq to find the nearby zeros.
The interpolator does not go beyond cubic, and before I learn to use a better one I'd first like to check whether a fast method already exists in numpy or scipy.
Question: Is there already a faster method in numpy or scipy to find all of the zero crossings in a large array (n = 1E+06 to 1E+09) than the way I've done it here?
The plot shows the residual errors, i.e. the true function evaluated at the interpolated zeros: the smaller curve comes from the cubic interpolation, the larger from the quadratic.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.optimize import brentq
e = np.exp(1)

def f(x):
    return np.sin(x + np.sin(x*e)/e)  # roughly periodic function

x = np.arange(0, 10000, 0.1)
y = f(x)

# indices i of upward zero crossings: y[i] <= 0 < y[i+1]
crossings = np.where((y[1:] > 0) & (y[:-1] <= 0))[0]
Qd = interp1d(x, y, kind='quadratic', assume_sorted=True)
Cu = interp1d(x, y, kind='cubic', assume_sorted=True)
# bracket each crossing between neighboring samples and refine with brentq
x0sQd = [brentq(Qd, x[i-1], x[i+1]) for i in crossings[1:-1]]
x0sCu = [brentq(Cu, x[i-1], x[i+1]) for i in crossings[1:-1]]

# residuals: the true function evaluated at the interpolated zeros
y0sQd = [f(x0) for x0 in x0sQd]
y0sCu = [f(x0) for x0 in x0sCu]
plt.figure()
plt.plot(x0sQd, y0sQd, label='quadratic')
plt.plot(x0sCu, y0sCu, label='cubic')
plt.legend()
plt.show()
By "find all of the zero crossings in a large array", I assume you mean "approximate the roots of a function underlying data in a large array" (as you've done by finding the roots of an interpolating spline). In that case, yes.
Instead of using the interp1d convenience function, use the object-oriented interface of an interpolator (e.g. CubicSpline), and use its roots method to find all of the roots.
After running your code:
from scipy.interpolate import CubicSpline
spline = CubicSpline(x, y, extrapolate=False)
# find all the roots
x0sCu2 = spline.roots()
# extract the ones with positive slope
dspline = spline.derivative()
slope = dspline(x0sCu2)
x0sCu2 = x0sCu2[slope > 0]
# confirm that the results agree with `brentq`
np.testing.assert_allclose(x0sCu2[1:-1], x0sCu)
This is a bit faster than what you have.
%timeit [brentq(Cu, x[i-1], x[i+1]) for i in crossings[1:-1]]
# 866 ms ± 11.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit x0sCu2 = spline.roots(); dspline = spline.derivative(); slope = dspline(x0sCu2); x0sCu2 = x0sCu2[slope > 0]
# 264 ms ± 5.34 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
There are actually faster ways than the built-in roots method, though. You had the right idea, but you could have used a vectorized rootfinder, like newton.
from scipy.optimize import newton
x0sCu3 = newton(Cu, x[crossings[1:-1]])
np.testing.assert_allclose(x0sCu3, x0sCu)
# Ignores the time spent finding `crossings`, but that's
# negligible, and can be improved (e.g. we don't need `where`)
%timeit newton(Cu, x[crossings[1:-1]])
# 3.93 ms ± 563 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
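As an aside, the where call mentioned in the comment above really can be skipped: indexing with the boolean mask directly gives the same initial guesses. A minimal sketch:
# same upward crossings, selected by boolean mask instead of np.where
up = (y[1:] > 0) & (y[:-1] <= 0)
guesses = x[:-1][up]               # identical values to x[crossings]
x0s_mask = newton(Cu, guesses[1:-1])
np.testing.assert_allclose(x0s_mask, x0sCu3)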
newton will almost certainly converge here since you're able to provide a very good guess. However, the nice thing about brentq is that it's essentially guaranteed to converge because you provide a bracket. There is a vectorized bracketing rootfinder, but it's private, so don't count on it being available in the same place in future versions of SciPy.
from scipy.optimize._zeros_py import _chandrupatla
res = _chandrupatla(Cu, x[crossings[1:-1]-1], x[crossings[1:-1]+1])
np.testing.assert_allclose(res.x, x0sCu)
%timeit _chandrupatla(Cu, x[crossings[1:-1]-1], x[crossings[1:-1]+1])
# 7.54 ms ± 1.24 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)
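For what it's worth, recent SciPy releases expose a public vectorized bracketing rootfinder; a guarded sketch, assuming scipy.optimize.elementwise.find_root (added around SciPy 1.15) accepts the bracket as a 2-tuple:
# Sketch, not a guarantee: prefer the public elementwise rootfinder
# if this SciPy has it, otherwise fall back to the private one.
try:
    from scipy.optimize.elementwise import find_root
    res = find_root(Cu, (x[crossings[1:-1]-1], x[crossings[1:-1]+1]))
except ImportError:
    res = _chandrupatla(Cu, x[crossings[1:-1]-1], x[crossings[1:-1]+1])
np.testing.assert_allclose(res.x, x0sCu)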
It sounded like you were also interested in higher-order splines for improved accuracy. In that case, you can use InterpolatedUnivariateSpline with order k <= 5, or make_interp_spline with any odd order.
from scipy.interpolate import InterpolatedUnivariateSpline, make_interp_spline
Qr = InterpolatedUnivariateSpline(x, y, k=4) # quartic
x0sQr = newton(Qr, x[crossings[1:-1]])
np.testing.assert_allclose(x0sQr, x0sCu, rtol=1e-6)
Qn = make_interp_spline(x, y, k=5) # quintic
x0sQn = newton(Qn, x[crossings[1:-1]])
np.testing.assert_allclose(x0sQn, x0sCu, rtol=1e-6)
y0sQr = f(x0sQr)
y0sQn = f(x0sQn)
plt.figure() # use a log-scale to resolve the difference
plt.semilogy(x0sQd, np.abs(y0sQd), label='quadratic')
plt.semilogy(x0sCu, np.abs(y0sCu), label='cubic')
plt.semilogy(x0sQr, np.abs(y0sQr), label='quartic')
plt.semilogy(x0sQn, np.abs(y0sQn), label='quintic')
plt.ylabel('residual')
plt.legend()
plt.show()
If by "find all of the zero crossings in a large array" you literally meant finding the zero crossings in the array, you've already bounded those in your code.
crossings = np.where((y[1:] > 0) & (y[:-1] <= 0))[0]
If you have the function available and all you're interested in is the roots, the approach might be different. That would be a separate question/answer, though.
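For completeness, a one-line sketch of that direction, refining against f itself with the same vectorized newton (the full treatment belongs in that separate question):
# Sketch: refine the bracketed crossings against f directly, no spline needed
x0s_f = newton(f, x[crossings[1:-1]])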