Search code examples
pythonnumpymultidimensional-arraysparse-matrixsimplify

simplity construction of sparse (transition) matrix


I am constructing a transition matrix from a n1 x n2 x ... x nN x nN array. For concreteness let N = 3, e.g.,

import numpy as np

# example with N = 3
n1, n2, n3 = 3, 2, 5
dim = (n1, n2, n3)
arr = np.random.random_sample(dim + (n3,))

Here arr contains transition probabilities between 2 states, where the "from"-state is indexed by the first 3 dimensions, and the "to"-state is indexed by the first 2 and the last dimension. I want to construct a transition matrix, which expresses these probabilities raveled into a sparse (n1*n2*n3) x (n1*n2*n3 matrix.

To clarify, let me provide my current approach that does what I want to do. Unfortunately, it's slow and doesn't work when N and n1, n2, ... are large. So I am looking for a more efficient way of doing the same that scales better for larger problems.

My approach

import numpy as np
from scipy import sparse as sparse

## step 1: get the index correponding to each dimension of the from and to state

# ravel axes 1 to 3 into single axis and make sparse 
spmat = sparse.coo_matrix(arr.reshape(np.prod(dim), -1))
data = spmat.data
row = spmat.row
col = spmat.col

# use unravel to get idx for 
row_unravel = np.array(np.unravel_index(row, dim))
col_unravel = np.array(np.unravel_index(col, n3))

## step 2: combine "to" index with rows 1 and 2 of "from"-index to get "to"-coordinates in full state space

row_unravel[-1, :] = col_unravel # first 2 dimensions of state do not change
colnew = np.ravel_multi_index(row_unravel, dim) # ravel back to 1d

## step 3: assemble transition matrix

out = sparse.coo_matrix((data, (row, colnew)), shape=(np.prod(dim), np.prod(dim)))

Final thought

I will be running this code many times. Across iterations, the data of arr may change, but the dimensions will stay the same. So one thing I could do is to save and load row and colnew from a file, skipping everything between the definition of data (line 2) and final line of my code. Do you think this would be the best approach?

Edit: One problem I see with this strategy is that if some elements of arr are zero (which is possible) then the size of data will change across iterations.


Solution

  • One approach that beats the one posted in the OP. Not sure if it's the most efficient.

    import numpy as np
    from scipy import sparse
    
    # get col and row indices
    idx = np.arange(np.prod(dim))
    row = idx.repeat(dim[-1])
    col = idx.reshape(-1, dim[-1]).repeat(dim[-1], axis=0).ravel()
    
    # get the data
    data = arr.ravel()
    
    # construct the sparse matrix
    out = sparse.coo_matrix((data, (row, col)), shape=(np.prod(dim), np.prod(dim)))
    

    Two things that could be improved:

    (1) if arr is sparse, the output matrix out will have zeros coded as nonzero.

    (2) The approach relies on the new state being the last dimension of dim. It would be nice to generalize so that the last axis of arr can replace any of the originating axis, not just the last one.