I would like to perform a matrix balancing in python. I thought to use linalg.matrix_balance
in order to make the input matrix a doubly stochastic matrix:
from scipy import linalg
import numpy as np
x = np.array([[1,2,7], [9,1,1], [1,2,10*np.pi]])
y, permscale = linalg.matrix_balance(x)
np.abs(x).sum(axis=0) / np.abs(x).sum(axis=1)
np.abs(y).sum(axis=0) / np.abs(y).sum(axis=1)
permscale
array([[ 1., 0., 0.],
[ 0., 2., 0.],
[ 0., 0., 1.]])
It doesn't look that it actually balancing the rows and cols of the matrix. Any idea what should I do?
If I do: y, permscale = linalg.matrix_balance(x, scale=False)
The result isn't normalized either:
array([[ 1. , 2. , 7. ], [ 9. , 1. , 1. ], [ 1. , 2. , 31.41592654]])
You are correct in your analysis of the output, but mistaken in your consideration of what matrix_balance
does. From the docs,
The balanced matrix satisfies the following equality,
B = inv(T) * A * T (matrix products)
Which we can verify easily with your matrices,
>>>print(np.dot(np.dot(np.linalg.inv(permscale), x), permscale))
[[ 1. 4. 7. ]
[ 4.5 1. 0.5 ]
[ 1. 4. 31.41592654]]
Which is indeed y
. This implies matrix_balance
works as it claims to do. You claim,
It doesn't look that it actually balancing the rows and cols of the matrix. Any idea what should I do?
But this is not true: the L1 norms of this modified matrix are balanced, to within one order of magnitude of two, such that the precise orders of magnitude are reflected to the scaling matrix permscale
.
Is your aim instead to make your matrix doubly-stochastic, rather than (just) balanced? If so, you could look at e.g. this project. Following the guide presented there I found the following doubly-stochastic matrix for your data,
>>>print(sk.fit(x))
[[ 0.1385636 0.55482644 0.30660996]
[ 0.79518122 0.17688932 0.02792947]
[ 0.06695665 0.26810303 0.66494032]]
>>>print(sk.fit(x).sum(axis=0), sk.fit(x).sum(axis=1))
[ 1.00070147 0.99981878 0.99947975] [ 1. 1. 1.]
Which ought to be 'close enough' for most use-cases.