I'm playing around with audio signal processing. For that purpose, I'm using a single-channel wave file from here.
My first experiment was to look at the auto-correlation of the signal, computed by the following code:
from scipy.io import wavfile
from scipy import signal
import numpy as np

sample_rate_a, data_a = wavfile.read('sounds/CantinaBand3.wav')

# full auto-correlation of the signal and the matching lag axis
corr = signal.correlate(data_a, data_a)
lags = signal.correlation_lags(len(data_a), len(data_a))

# normalize so the maximum is 1.0, then locate it
corr = corr / np.max(corr)
lag = lags[np.argmax(corr)]
print(lag, np.max(corr))
Given that it is an auto-correlation, I would have expected to see a peak of 1.0 at lag 0 (since I normalize the correlation array by its maximum).
However, the program outputs a peak of 1.0 at lag -36141. At lag 0, the correlation is -0.3526826588536898.
Currently, I do not have any explanation for this behavior. Is there an error in my calculation of the correlation or the lags?
When you read your wav file, data_a comes back as an int16 array, so the correlation is computed in int16 as well, which causes a lot of overflow. To get a meaningful result, convert the data to a floating point array first.
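To see the wraparound in isolation, here is a minimal sketch (the sample value 30000 is made up for illustration, not taken from the wav file):

import numpy as np

a = np.array([30000], dtype=np.int16)
# 30000 * 30000 = 900000000 does not fit in 16 bits;
# NumPy silently wraps it modulo 2**16, yielding a negative value.
print(a * a)  # [-5888]

The same wraparound happens inside the correlation sums, which is why the global maximum can wander away from lag 0. With the conversion added, the script becomes: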
from scipy.io import wavfile
from scipy import signal
import numpy as np

_, data_a = wavfile.read('./CantinaBand3.wav')

# convert from int16 to float32 so the correlation sums cannot overflow
data_a = data_a.astype(np.float32)

corr = signal.correlate(data_a, data_a)
lags = signal.correlation_lags(len(data_a), len(data_a))
corr = corr / np.max(corr)
lag = lags[np.argmax(corr)]
print(lag, np.max(corr))
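With the float conversion, the program should now print a peak of 1.0 at lag 0: the zero-lag value of an autocorrelation is the sum of squares of the samples, and by the Cauchy-Schwarz inequality no other lag can exceed it.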