I'm using Python, but this is a generic question (more about algorithms than the language), so I'll skip some steps to get to the gist of the matter:
I generate a sine signal like this:
import math as m
signal = [m.sin(2*m.pi*1*(t/n-d)) for t in range(n)]
So this is a sine signal, normalized so that the frequency is 1 and time runs from 0 to 1 second (basically a single cycle of a sine wave). There is also a delay term d that delays the signal (causes a phase shift). n is just the number of samples.
I also create another signal with a different delay. Let's say I use a delay of 0 for the first signal and a delay of x for the second (I abbreviate the previous code for the sake of clarity):
signal1 = signal(delay=0)
signal2 = signal(delay=x)
and then I do a correlation:
from scipy import signal as sgn
corr11 = sgn.correlate(signal1, signal1, mode = 'full')
corr12 = sgn.correlate(signal1, signal2, mode = 'full')
I also know that the signal delay corresponds to the location of the correlation maximum, so I take the peak index of each correlation:
import numpy as np
a1 = np.argmax(corr11)
a2 = np.argmax(corr12)
So I've found that the correlation of the signal with itself has its maximum peak in the middle of the correlation array (or plot/function). But the other peak is weird:
So the question is: how does the delay d relate to the peak location after correlating the signals?
It seems like the delay is approximately equal to (a1 - a2) / n. However, I think the answer is somewhat distorted by the fact that a) you are only using a single-period sine wave, and b) you are using a finite number of data points (obviously). To get a more accurate answer for the case of a single-period sine wave, you'd probably want to take the mathematical definition of correlation and do the necessary integration with the correct limits (but I'm not sure SO is the correct place to ask for help with integration).
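As a rough sketch of that integration (my own working under stated assumptions, not a library definition): treat both signals as zero outside the interval [0, 1], let s be how far the delayed signal is slid back, which is assumed to correspond to (a1 - a2) / n, and restrict to 0 <= s <= 1 so that the overlap is [0, 1 - s]. The symbols R and s are introduced here for illustration only.

```latex
\begin{align*}
R(s)  &= \int_0^{1-s} \sin(2\pi t)\,\sin\bigl(2\pi(t+s-d)\bigr)\,dt \\
      &= \tfrac{1}{2}(1-s)\cos\bigl(2\pi(s-d)\bigr)
         + \tfrac{1}{4\pi}\sin(2\pi s)\cos(2\pi d), \\
R'(s) &= -\tfrac{1}{2}\cos\bigl(2\pi(s-d)\bigr)
         - \pi(1-s)\sin\bigl(2\pi(s-d)\bigr)
         + \tfrac{1}{2}\cos(2\pi s)\cos(2\pi d), \\
R'(d) &= -\tfrac{1}{2}\sin^2(2\pi d) \le 0 .
\end{align*}
```

Since R is still decreasing at s = d, this suggests the peak for a single period sits at or slightly before s = d rather than exactly on it. For d = 0.28328, the condition R'(s) = 0 is met at s ≈ 0.25, which is consistent with the single-period output printed by the script.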
Here is a self-contained script which plots the signals and the correlations, which will hopefully provide some more intuition. NB: the approximation I gave above seems to become more accurate when you increase the number of periods of the sine wave. For example, with 100 periods and 100000 data points, the approximation above (modified here as n_repeats * (a1 - a2) / n) seems to become a lot more accurate.
import numpy as np
from scipy import signal
import matplotlib.pyplot as plt
# Set parameters
# x = 0.5
x = 0.28328
# x = 0.25
# x = 0.1
# n = 100000
# n_repeats = 100
n = 1000
n_repeats = 1
# Get correlations
t = np.linspace(0, n_repeats, n)
sin_delay = lambda delay: np.sin(2.0 * np.pi * (t - delay))
signal1 = sin_delay(delay=0)
signal2 = sin_delay(delay=x)
corr11 = signal.correlate(signal1, signal1, mode = 'full')
corr12 = signal.correlate(signal1, signal2, mode = 'full')
a1 = np.argmax(corr11)
a2 = np.argmax(corr12)
# Print output
print(a1, a2, x, n_repeats * (a1 - a2) / n)
# Make plots
plt.figure()
plt.plot(signal1, "r")
plt.plot(signal2, "b")
plt.title("Signals, delay = {:.3f}".format(x))
plt.legend(["Original signal", "Delayed signal"], loc="upper right")
plt.grid(True)
plt.savefig("Signals")
plt.figure()
plt.plot(corr11, "r")
plt.plot(corr12, "b")
plt.title("Correlations, delay = {:.3f}".format(x))
plt.legend(["Auto-correlation", "Cross-correlation"], loc="upper right")
plt.grid(True)
plt.savefig("Correlations")
n = 1000, n_repeats = 1
999 749 0.28328 0.25
n = 100000, n_repeats = 100
99999 99716 0.28328 0.283
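The same comparison can be sketched without the auto-correlation by converting the peak index to a lag directly. This is a minimal sketch, assuming SciPy 1.6 or later for signal.correlation_lags; the helper estimate_delay and its samples_per_period parameter are hypothetical names introduced here, and endpoint=False is my choice (the script above includes the endpoint).

```python
import numpy as np
from scipy import signal


def estimate_delay(n_repeats, delay, samples_per_period=1000):
    """Estimate `delay` (in periods) from the cross-correlation peak."""
    n = n_repeats * samples_per_period
    # endpoint=False keeps the sample spacing at 1 / samples_per_period
    t = np.linspace(0, n_repeats, n, endpoint=False)
    dt = t[1] - t[0]
    signal1 = np.sin(2.0 * np.pi * t)
    signal2 = np.sin(2.0 * np.pi * (t - delay))
    corr12 = signal.correlate(signal1, signal2, mode="full")
    # lags[k] is the lag (in samples) that corr12[k] corresponds to
    lags = signal.correlation_lags(len(signal1), len(signal2), mode="full")
    # With this argument order a delayed second signal peaks at a negative
    # lag, so the sign is flipped to report a positive delay.
    return -lags[np.argmax(corr12)] * dt


x = 0.28328
for n_repeats in (1, 10, 100):
    # Print the number of periods next to the estimated delay
    print(n_repeats, estimate_delay(n_repeats, x))
```

The estimate should move from roughly 0.25 towards x as n_repeats grows, in line with the two outputs above. Swapping the arguments to signal.correlate flips the sign of the lag, so the minus sign in the return value depends on the argument order used here.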