I'm trying to use scipy's quadratic_assignment function but I can't understand how the output can describe an optimal solution. Here is a minimal example where I compare a small matrix to itself:
import numpy as np
from scipy.optimize import quadratic_assignment
# Parameters
n = 5
p = np.log(n)/n # Approx. 32%
# Definitions
A = np.random.rand(n,n)<p
# Quadratic assignment
res = quadratic_assignment(A, A)
print(res.col_ind)
and the results seem to be random assignments:
[3 0 1 4 2]
[3 2 4 1 0]
[3 2 1 0 4]
[4 3 1 0 2]
[2 3 0 1 4]
...
However, according to the docs col_ind
is supposed to be the Column indices corresponding to the best permutation found of the nodes of B. Since the input matrices are equal (B==A), I would thus expect the identity assignment [0 1 2 3 4]
to pop out. Changing n
to larger values does not help.
Is there something I am getting wrong?
The quadratic_assignment
function (approximately) solves two types of problem: quadratic assignment and graph matching. These are mathematically very similar: the difference is that quadratic assignment involves minimising an objective function, whereas graph matching involves maximising that same function. (Reference: scipy docs).
To distinguish between these two problems, the function accepts an option maximize
(passed as a key:value pair in the options
argument of the function). If not supplied, this defaults to False
(i.e. the solution corresponds to the minimum of the function, i.e. the quadratic assignment problem rather than the graph matching problem).
quadratic_assignment(A, B) # solves quadratic assignment
quadratic_assignment(A, B, options={'maximize': False}) # solves quadratic assignment
quadratic_assignment(A, B, options={'maximize': True}) # solves graph matching
Now, when you say "Since the input matrices are equal (B==A), I would thus expect the identity assignment [0 1 2 3 4] to pop out" – that's what would be expected for graph matching, i.e. the identity assignment is indeed the permutation of B that matches A better than any other (or strictly speaking: at least as well as any other).
So you would get your expected result by specifying options={'maximize': True}
so that you get the solution to the graph matching problem rather than the default quadratic assignment:
import numpy as np
from scipy.optimize import quadratic_assignment
# Parameters
n = 5
p = np.log(n)/n # Approx. 32%
# Definitions
A = np.random.rand(n,n)<p
# Graph matching
res = quadratic_assignment(A, A, options={'maximize': True})
print(res.col_ind)
When I run this, I get the expected [0 1 2 3 4]
around 97% of the time. It's not 100% because (a) the algorithm is approximate and "not guaranteed to be optimal" (per the docs) and (b) the way that you construct A
could create degenerate cases where there are multiple equally good solutions.