Search code examples
pythonmatlabsignal-processingnormalizationcorrelation

Python normalizing 1D cross correlation


I've written some Python code to emulate MATLABs xcorr function for cross correlations:

def xcorr(x, y, scale='none'):
    # Pad shorter array if signals are different lengths
    if x.size > y.size:
        pad_amount = x.size - y.size
        y = np.append(y, np.repeat(0, pad_amount))
    elif y.size > x.size:
        pad_amount = y.size - x.size
        x = np.append(x, np.repeat(0, pad_amount))

    corr = np.correlate(x, y, mode='full')  # scale = 'none'
    lags = np.arange(-(x.size - 1), x.size)

    if scale == 'biased':
        corr = corr / x.size
    elif scale == 'unbiased':
        corr /= (x.size - abs(lags))
    elif scale == 'coeff':
        corr /= np.sqrt(np.dot(x, x) * np.dot(y, y))

I get the same values when comparing the values of the different scale types to MATLABs implementation, so this seems correct

One additional thing I'd like to add is the ability to normalize the cross correlation values so peaks don't exceed 1.0, and valleys dont drop below -1.0

coeff is already normalized so I'm not worried about that. However, the other scale types can exceed the -1/1 bounds.

I've tried a couple of things:

  1. Adding corr /= max(corr) to the end of my function to normalize corr regardless of which scale option is chosen. This keeps the upper bound in check, but I'm not sure if this correctly handles the lower bound
  2. Adding corr /= np.sqrt(np.dot(x, x) * np.dot(y, y)) to the end of my function for all options, but this seems to squash my values far away from 1.0

Whats the correct way to normalize none, biased, and unbiased scale options? MATLAB doesnt have the functionality for this, and Google doesnt turn up any results for normalization of biased/unbiased cross correlation estimates.


Solution

  • I’m confused. none implies no normalization, and biased and unbiased imply the appropriate normalization so samples of the output correspond to the appropriate estimators. It doesn’t make sense to as “what normalization should I apply to a biased estimate of correlation so that it’s bounded to [-1, 1]” because then the estimate wouldn’t be a biased estimate any more, it’d be something else. The only estimator (among this bunch) that has this property is the correlation coefficient (the signal-processing-variant of Pearson’s coefficient), which is what coeff corresponds to.

    This implementation is fine as it is. Anyone seeking numbers in the [-1, 1] interval knows they should ask for the correlation coefficients via np.corrcoef().