I want to calculate the MSE between corresponding rows of two very large 2D arrays.
x1 = [1,2,3]
x2 = [1,3,5]
x3 = [1,5,9]
x = [x1,x2,x3]
y1 = [2,3,4]
y2 = [3,4,5]
y3 = [4,5,6]
y = [y1,y2,y3]
The expected result is a vector of size 3:
[mse(x1,y1), mse(x2,y2), mse(x3,y3)]
Right now I am using sklearn.metrics.mean_squared_error like this:
from sklearn.metrics import mean_squared_error
mses = list(map(mean_squared_error, x, y))
This takes an extremely long time, as the real length of each xi and yi is 115 and I have over a million vectors in x/y.
You can use numpy.
import numpy as np
a = np.array(x) # your x
b = np.array(y) # your y
mses = ((a-b)**2).mean(axis=1)
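As a quick sanity check, here is a minimal sketch that runs the same expression on the small x and y from the question and compares it against a plain-Python loop. The name `reference` is only for illustration and is not part of the original code.

```python
import numpy as np

# The small example from the question.
x = [[1, 2, 3], [1, 3, 5], [1, 5, 9]]
y = [[2, 3, 4], [3, 4, 5], [4, 5, 6]]

a = np.array(x, dtype=float)
b = np.array(y, dtype=float)

# Squared differences, averaged along each row (axis=1).
mses = ((a - b) ** 2).mean(axis=1)

# Plain-Python reference computed row by row (illustrative only).
reference = [
    sum((p - q) ** 2 for p, q in zip(xi, yi)) / len(xi)
    for xi, yi in zip(x, y)
]

print(mses)       # approximately 1.0, 1.667, 6.0
print(reference)
```

Each entry is the mean of the squared element-wise differences for one pair of rows, so the result has one value per row.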
If you want to use your x and y, convert them to arrays as above. For a quick timing test with random data of a similar size:
a = np.random.normal(size=(1000000,100))
b = np.random.normal(size=(1000000,100))
mses = ((a-b)**2).mean(axis=1)
For a matrix of this size (1,000,000 x 100, close to yours), this takes less than a second on my machine.
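If memory is a concern at this scale, one possible variant is to sum the squared differences with np.einsum, which avoids allocating the separate squared array. This is only a sketch: the function name `rowwise_mse` is hypothetical, the row length of 115 is taken from the question, and whether it is actually faster than the one-liner above depends on your machine and NumPy build.

```python
import numpy as np

def rowwise_mse(a, b):
    """Row-wise MSE between two equally shaped 2D arrays (hypothetical helper)."""
    d = a - b
    # Sum d[i, j] * d[i, j] over j for each row i, then divide by the row length.
    return np.einsum("ij,ij->i", d, d) / d.shape[1]

# Small random test data; scale the first dimension up to match your real input.
rng = np.random.default_rng(0)
a = rng.normal(size=(1000, 115))
b = rng.normal(size=(1000, 115))

fast = rowwise_mse(a, b)
baseline = ((a - b) ** 2).mean(axis=1)
print(np.allclose(fast, baseline))
```

Both versions compute the same values; the einsum form just folds the squaring and the row sum into a single step.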