Search code examples
pythonscipysparse-matrix

Recovering explicit zeros from Scipy MST


I have a graph with edges of weight zero and want to compute the MST. Suppose that the graph is as follows:

input graph             minimum spanning tree

     (0)                         (0)
    / | \                       / | \
   2  |  3                     2  |  3
  /   |   \                   /   |   \
(3)----5--(1)               (3)   |    (1)
  \   |    /                      0     
   2  0   3                       |    
    \ | /                         |  
     (2)                         (2)


If I compute that on Scipy, it correctly computes the MST, but is didn't store the (0,2) - 0 edge. Any hints about how can I recover this information from the output MST?

my code:

from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree
X = csr_matrix([[0, 3, 0, 2],
                [3, 0, 3, 5],
                [0, 3, 0, 2],
                [2, 5, 2, 0]])
X[0,2] = 0
X[2,0] = 0
Tcsr = minimum_spanning_tree(X)
print(Tcsr)
print(f'non-zero entries: {Tcsr.nnz}')

output:

(0, 1)  3.0
(0, 3)  2.0
non-zero entries: 2

Solution

  • One approach would be to avoid giving any zeros to minimum_spanning_tree, by adding one to every edge, and subtracting one from every edge afterward. (Assuming you have no negative edges.)

    Why is this correct? The idea is that every minimum spanning tree of a graph with n nodes has n - 1 edges. If 1 is added to every edge, then the total cost of the minimum spanning tree increases by n - 1. But that cost is added to every possible spanning tree, so the relative order of spanning trees is the same. Therefore, the minimum spanning tree of the graph plus one is the same as the minimum spanning tree of the graph.

    Example:

    from scipy.sparse import csr_matrix
    from scipy.sparse.csgraph import minimum_spanning_tree
    X = csr_matrix([[0, 3, 0, 2],
                    [3, 0, 3, 5],
                    [0, 3, 0, 2],
                    [2, 5, 2, 0]])
    X[0,2] = 0
    X[2,0] = 0
    # Note: X is assumed to be a CSR or CSC matrix
    X.data += 1
    Tcsr = minimum_spanning_tree(X)
    Tcsr.data -= 1
    print(Tcsr)
    print(f'non-zero entries: {Tcsr.nnz}')
    

    There is a more thorough explanation of the proof here: Questions on shortest path and minimum spanning tree