I would like to correlate time series (540 months long) per/for each single grid box, looping over 2 latitudes x 6 longitudes = 12 grid boxes in total. I manage to get the correlation coefficient, but the p-value turns out to be "nan" even though they should not.
np.corrcoef(TheData1[gotdata,lt,ln],TheData2[gotdata,lt,ln])[0,1]
works fine to get the correlation coefficient, but not for the
p-value.slope,intercept,r_value,p_value,std_err=scipy.stats.linregress(TheData1[gotdata,lt,ln],TheData2[gotdata,lt,ln])
gives pvalue=nan.scipy.stats.pearsonr(TheData1[gotdata,lt,ln],TheData2[gotdata,lt,ln])
gives the error"ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
TheMDI stands for -1e30, missing data.
for lt in range(NLats): # range nlats goes through 0...36
for ln in range(NLons): # range nlons goes through 0...72
# Short way
# create a map pointing to the elements of the array that are non-missing
gotdata = np.where((TheData1[:,lt,ln] > TheMDI) & (TheData2[:,lt,ln] > TheMDI))
OutputData[lt,ln] = np.corrcoef(TheData1[gotdata,lt,ln]/np.std(TheData1[gotdata,lt,ln]),TheData2[gotdata,lt,ln]/np.std(TheData2[gotdata,lt,ln]))[0,1]
OutputDataSig[lt,ln]=scipy.stats.linregress(TheData1[gotdata,lt,ln],TheData2[gotdata,lt,ln])[3]
Solved by choosing the very first entry ([0]) of gotdata only:
gotdata = np.where((TheData1[:,lt,ln] > TheMDI) & (TheData2[:,lt,ln] > TheMDI))[0]
This makes scipy.stats.pearsonr giving out a p-value.