I've written some Python code to emulate MATLABs xcorr
function for cross correlations:
def xcorr(x, y, scale='none'):
# Pad shorter array if signals are different lengths
if x.size > y.size:
pad_amount = x.size - y.size
y = np.append(y, np.repeat(0, pad_amount))
elif y.size > x.size:
pad_amount = y.size - x.size
x = np.append(x, np.repeat(0, pad_amount))
corr = np.correlate(x, y, mode='full') # scale = 'none'
lags = np.arange(-(x.size - 1), x.size)
if scale == 'biased':
corr = corr / x.size
elif scale == 'unbiased':
corr /= (x.size - abs(lags))
elif scale == 'coeff':
corr /= np.sqrt(np.dot(x, x) * np.dot(y, y))
I get the same values when comparing the values of the different scale types to MATLABs implementation, so this seems correct
One additional thing I'd like to add is the ability to normalize the cross correlation values so peaks don't exceed 1.0
, and valleys dont drop below -1.0
coeff
is already normalized so I'm not worried about that. However, the other scale types can exceed the -1/1 bounds.
I've tried a couple of things:
corr /= max(corr)
to the end of my function to normalize corr
regardless of which scale option is chosen. This keeps the upper bound in check, but I'm not sure if this correctly handles the lower boundcorr /= np.sqrt(np.dot(x, x) * np.dot(y, y))
to the end of my function for all options, but this seems to squash my values far away from 1.0Whats the correct way to normalize none
, biased
, and unbiased
scale options? MATLAB doesnt have the functionality for this, and Google doesnt turn up any results for normalization of biased/unbiased cross correlation estimates.
I’m confused. none
implies no normalization, and biased
and unbiased
imply the appropriate normalization so samples of the output correspond to the appropriate estimators. It doesn’t make sense to as “what normalization should I apply to a biased estimate of correlation so that it’s bounded to [-1, 1]” because then the estimate wouldn’t be a biased estimate any more, it’d be something else. The only estimator (among this bunch) that has this property is the correlation coefficient (the signal-processing-variant of Pearson’s coefficient), which is what coeff
corresponds to.
This implementation is fine as it is. Anyone seeking numbers in the [-1, 1] interval knows they should ask for the correlation coefficients via np.corrcoef()
.