import numpy as np

a = np.array([[1, 2, 4], [3, 6, 2], [3, 4, 7], [9, 7, 7], [6, 3, 1], [3, 5, 9]])
b = np.array([[4, 5, 2], [9, 2, 5], [1, 5, 6], [4, 5, 6], [1, 2, 6], [6, 4, 3]])
a = array([[1, 2, 4],
           [3, 6, 2],
           [3, 4, 7],
           [9, 7, 7],
           [6, 3, 1],
           [3, 5, 9]])
b = array([[4, 5, 2],
           [9, 2, 5],
           [1, 5, 6],
           [4, 5, 6],
           [1, 2, 6],
           [6, 4, 3]])
I would like to calculate the Pearson correlation coefficient between the first row of a and the first row of b, the second row of a and the second row of b, and so on for each remaining pair of rows.
The desired output is a 1D array with one coefficient per row pair:
array([__, __, __])
Column-wise, I can do it as below:
corr = np.corrcoef(a.T, b.T).diagonal(a.shape[1])
Output:
array([-0.2324843 , -0.03631365, -0.18057878])
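To see why the offset diagonal picks out the right values, it may help to look at the block structure of what np.corrcoef returns. The sketch below assumes the same a and b as above; the names full, cross and col_corr are only illustrative.

```python
import numpy as np

a = np.array([[1, 2, 4], [3, 6, 2], [3, 4, 7], [9, 7, 7], [6, 3, 1], [3, 5, 9]])
b = np.array([[4, 5, 2], [9, 2, 5], [1, 5, 6], [4, 5, 6], [1, 2, 6], [6, 4, 3]])

n_cols = a.shape[1]

# corrcoef stacks the variables of both inputs, so the result is
# (2 * n_cols) x (2 * n_cols): [[a-vs-a, a-vs-b], [b-vs-a, b-vs-b]].
full = np.corrcoef(a.T, b.T)

# The upper-right block holds every column of a against every column of b.
cross = full[:n_cols, n_cols:]

# Its main diagonal pairs column i of a with column i of b, which is the
# same as taking the diagonal of the full matrix at offset n_cols.
col_corr = np.diag(cross)
print(full.shape, col_corr)
```

Note that this computes all cross-correlations and then discards most of them, which is fine for small inputs but wasteful when there are many columns.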
UPDATE
Although I accepted the answer below, here is an alternative solution that also guards against division-by-zero errors:
def corr2_coeff(A, B):
    # Row-wise mean of the input arrays, subtracted from the arrays themselves
    A_mA = A - A.mean(1)[:, None]
    B_mB = B - B.mean(1)[:, None]
    # Sum of squares across rows
    ssA = (A_mA**2).sum(1)
    ssB = (B_mB**2).sum(1)
    # Small constant keeps the denominator non-zero for constant rows
    deno = np.sqrt(np.dot(ssA[:, None], ssB[None])) + 0.00000000000001
    # Finally get the correlation coefficients
    return np.dot(A_mA, B_mB.T) / deno
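As far as I can tell, corr2_coeff returns the correlation of every row of A with every row of B, so the row-wise values asked for are its main diagonal. A minimal usage sketch, assuming the same a and b as above (the names pairwise, row_corr and const are only illustrative):

```python
import numpy as np

def corr2_coeff(A, B):
    # Same computation as above, repeated so this snippet runs on its own
    A_mA = A - A.mean(1)[:, None]
    B_mB = B - B.mean(1)[:, None]
    ssA = (A_mA**2).sum(1)
    ssB = (B_mB**2).sum(1)
    deno = np.sqrt(np.dot(ssA[:, None], ssB[None])) + 1e-14
    return np.dot(A_mA, B_mB.T) / deno

a = np.array([[1, 2, 4], [3, 6, 2], [3, 4, 7], [9, 7, 7], [6, 3, 1], [3, 5, 9]])
b = np.array([[4, 5, 2], [9, 2, 5], [1, 5, 6], [4, 5, 6], [1, 2, 6], [6, 4, 3]])

pairwise = corr2_coeff(a, b)     # shape (6, 6): row i of a vs row j of b
row_corr = pairwise.diagonal()   # keep only the matching row pairs

# A constant row has zero variance; the added constant turns 0/0 into 0/1e-14.
const = corr2_coeff(np.array([[2, 2, 2]]), np.array([[1, 2, 3]]))
print(row_corr, const)
```

For a constant row this yields 0.0, whereas np.corrcoef would give nan with a runtime warning. Whether 0 is the right answer for an undefined correlation depends on the application.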
I would compute the column-wise correlations shown in the question as:
[np.corrcoef(x, y)[0,1] for x, y in zip(a.T, b.T)]
with the result:
[-0.23248430170889073, -0.03631365196012811, -0.18057877962865382]
The row-by-row correlations are then obtained by simply dropping the transposes:
[np.corrcoef(x,y)[0,1] for x, y in zip(a, b)]
with the result:
[-0.7857142857142855,
-0.661143091251952,
0.8170571691028832,
-0.8660254037844387,
-0.9011271137791659,
-0.9285714285714285]
If you want a solution that follows the approach in the question (taking an offset diagonal of the full correlation matrix), use:
np.corrcoef(a, b).diagonal(a.shape[0])
or
np.corrcoef(a.T, b.T, rowvar=False).diagonal(a.shape[0])
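Both diagonal-based forms build a 12 x 12 matrix to extract 6 numbers, and the list comprehension loops in Python. If that becomes a bottleneck, one possible alternative is to apply the Pearson formula to matching rows directly. This is only a sketch; rowwise_pearson is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def rowwise_pearson(A, B):
    """Pearson r between row i of A and row i of B (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Center each row on its own mean
    A_c = A - A.mean(axis=1, keepdims=True)
    B_c = B - B.mean(axis=1, keepdims=True)
    # Covariance-like numerator and product of row norms, one value per row
    num = (A_c * B_c).sum(axis=1)
    den = np.sqrt((A_c**2).sum(axis=1) * (B_c**2).sum(axis=1))
    # Rows with zero variance give 0/0 here, i.e. nan, as with np.corrcoef
    return num / den

a = np.array([[1, 2, 4], [3, 6, 2], [3, 4, 7], [9, 7, 7], [6, 3, 1], [3, 5, 9]])
b = np.array([[4, 5, 2], [9, 2, 5], [1, 5, 6], [4, 5, 6], [1, 2, 6], [6, 4, 3]])

r = rowwise_pearson(a, b)
print(r)
```

On the arrays above this should agree with np.corrcoef(a, b).diagonal(a.shape[0]) up to floating-point rounding.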