python numpy scipy signal-processing cross-correlation

Find signal or phase delay from cross correlation

I'm using Python, but this is a generic question (more about algorithms than the language), so I'll skip some steps to get to the gist of the matter:

I generate a sine signal like this:

import math as m

n = 1000  # number of samples (example value)

def signal(delay):
    # one cycle of a unit-frequency sine, sampled at t/n for t = 0..n-1
    return [m.sin(2*m.pi*1*(t/n - delay)) for t in range(n)]

This is a sine signal normalized so that the frequency is 1 and time runs from 0 to 1 second (so basically a single cycle of a sine wave). The delay term (call it d) delays the signal, which causes a phase shift, and n is just the number of samples.

I also create a second signal with a different delay. Say I use a delay of 0 for the first signal and a delay of x for the second:

x = 0.1  # delay of the second signal (example value)
signal1 = signal(delay=0)
signal2 = signal(delay=x)

and then I do a correlation:

from scipy import signal as sgn
corr11 = sgn.correlate(signal1, signal1, mode = 'full')
corr12 = sgn.correlate(signal1, signal2, mode = 'full')

I also know that the signal delay corresponds to the location of the correlation maximum, so I extract two points:

import numpy as np

a1 = np.argmax(corr11)
a2 = np.argmax(corr12)

So I've found that the correlation of a signal with itself has its maximum peak in the middle of the correlation array (or plot/function). But the other peak is weird:

At delay 0 and 1: a2 is the same as a1
At delay 0.5: the distance of a2 from a1 is 0.5 of a1 (inverted signal)
At delay 0.28328: a2 is 0.75 of a1
At delay 0.1: a2 is 0.90888 of a1

So the question is: how does the delay d relate to the peak location after correlating the signals?

Solution

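It seems like the delay is approximately equal to (a1 - a2) / n. With mode = 'full', the auto-correlation peak a1 sits at index n - 1, which is zero lag, so a1 - a2 is the shift in samples, and dividing by n converts it to time (the signal spans 1 second). However, I think the answer is somewhat distorted by the fact that a) you are only using a single period of the sine wave, so the two signals overlap less and less as the lag grows, and b) you are using a finite number of data points, which limits the resolution to one sample. To get a more accurate answer for the single-period case, you'd probably want to take the mathematical definition of correlation and do the necessary integration with the correct limits (but I'm not sure SO is the right place to ask for help with integration).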
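As a rough sketch of that integration (my own working, so treat it as an assumption rather than a reference result): for two one-second windows, sliding the delayed signal earlier by u (0 <= u <= 1) leaves an overlap of length 1 - u, and integrating sin(2*pi*t) * sin(2*pi*(t + u - d)) over that overlap gives R(u) = 0.5*(1 - u)*cos(2*pi*(u - d)) + cos(2*pi*d)*sin(2*pi*u)/(4*pi). The (1 - u) factor is what pulls the peak away from u = d. The snippet below compares this expression with scipy's numerical result; the function name finite_window_xcorr and the variable u are hypothetical names introduced here.

```python
import numpy as np
from scipy import signal


def finite_window_xcorr(u, d):
    """Closed-form correlation of sin(2*pi*t) with sin(2*pi*(t - d)),
    both limited to 0 <= t <= 1, after sliding the delayed signal
    earlier by u (valid for 0 <= u <= 1). Derived by hand; an assumption."""
    return (0.5 * (1.0 - u) * np.cos(2.0 * np.pi * (u - d))
            + np.cos(2.0 * np.pi * d) * np.sin(2.0 * np.pi * u) / (4.0 * np.pi))


d = 0.28328
n = 1000
t = np.linspace(0.0, 1.0, n)
dt = t[1] - t[0]

signal1 = np.sin(2.0 * np.pi * t)
signal2 = np.sin(2.0 * np.pi * (t - d))
corr12 = signal.correlate(signal1, signal2, mode="full")

# Index n - 1 is zero lag; index k <= n - 1 corresponds to u = (n - 1 - k) * dt.
u = (n - 1 - np.arange(n)) * dt
numeric = corr12[:n] * dt            # Riemann-sum approximation of the integral
closed = finite_window_xcorr(u, d)

max_diff = np.max(np.abs(numeric - closed))
u_peak = u[np.argmax(closed)]
print("largest difference between numeric and closed form:", max_diff)
print("closed-form peak at u =", u_peak, "for d =", d)
```

For d = 0.28328 the closed-form peak lands close to u = 0.25 rather than at u = d, which is in line with the 0.75 ratio reported in the question.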

Here is a self-contained script which plots the signals and the correlations, which will hopefully provide some more intuition. NB: the approximation I gave above becomes more accurate when you increase the number of periods of the sine wave. For example, with 100 periods and 100000 data points, the approximation (modified here to n_repeats * (a1 - a2) / n, because the signal now spans n_repeats seconds) is much closer to the true delay.

Script

import numpy as np
from scipy import signal
import matplotlib.pyplot as plt

# Set parameters

# x = 0.5
x = 0.28328
# x = 0.25
# x = 0.1
# n = 100000
# n_repeats = 100
n = 1000
n_repeats = 1

# Get correlations
t = np.linspace(0, n_repeats, n)

def sin_delay(delay):
    return np.sin(2.0 * np.pi * (t - delay))

signal1 = sin_delay(delay=0)
signal2 = sin_delay(delay=x)

corr11 = signal.correlate(signal1, signal1, mode = 'full')
corr12 = signal.correlate(signal1, signal2, mode = 'full')

a1 = np.argmax(corr11)
a2 = np.argmax(corr12)

# Print output
print(a1, a2, x, n_repeats * (a1 - a2) / n)

# Make plots
plt.figure()
plt.plot(signal1, "r")
plt.plot(signal2, "b")
plt.title("Signals, delay = {:.3f}".format(x))
plt.legend(["Original signal", "Delayed signal"], loc="upper right")
plt.grid(True)
plt.savefig("Signals")
plt.figure()
plt.plot(corr11, "r")
plt.plot(corr12, "b")
plt.title("Correlations, delay = {:.3f}".format(x))
plt.legend(["Auto-correlation", "Cross-correlation"], loc="upper right")
plt.grid(True)
plt.savefig("Correlations")

Console output with `n = 1000, n_repeats = 1`

999 749 0.28328 0.25

Console output with `n = 100000, n_repeats = 100`

99999 99716 0.28328 0.283
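
The two runs above can also be compared in one go. This is a minimal sketch that wraps the index arithmetic in a helper; the name estimate_delay and the choice of 1000 samples per period are my assumptions, not part of the original script. It uses the actual sample spacing t[1] - t[0] instead of n_repeats / n, and forces the FFT method so the long signal correlates quickly.

```python
import numpy as np
from scipy import signal


def estimate_delay(sig1, sig2, dt):
    """Delay of sig2 relative to sig1, taken from the cross-correlation peak.

    Assumes both signals are sampled every dt seconds.
    """
    corr = signal.correlate(sig1, sig2, mode="full", method="fft")
    zero_lag_index = len(sig2) - 1   # same index as the auto-correlation peak a1
    return (zero_lag_index - np.argmax(corr)) * dt


x = 0.28328                          # true delay
results = {}
for n_repeats in (1, 10, 100):
    n = 1000 * n_repeats             # assumed: 1000 samples per period
    t = np.linspace(0, n_repeats, n)
    sig1 = np.sin(2.0 * np.pi * t)
    sig2 = np.sin(2.0 * np.pi * (t - x))
    results[n_repeats] = estimate_delay(sig1, sig2, dt=t[1] - t[0])
    print(n_repeats, results[n_repeats], abs(results[n_repeats] - x))
```

The error should shrink as n_repeats grows, because the loss of overlap at a lag of a fraction of a period matters less when the signal covers many periods.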
