python numpy scipy linear-algebra sparse-matrix

Sparse matrix of variable movement (digits) between 2 same sized lists

I want to create a sparse matrix of the differences between the indexes of two 1D arrays or lists of digits. These two rows give us the positions at time 'a' and at a later time 'b'.

a = [6,3,10,2,5,7,4,11,8,9]
b = [10,3,6,5,11,2,7,8,9,4]

As you can see, '6' has moved from index 0 to index 2. '3' from index 1, to index 1. '10' from index 2, to index 0. '2' from index 3, to index 5. and so on...

I want to map this movement on to a sparse n*n matrix. Each row and column is in numerical order starting from 2 as per:

>>>sorted(a)
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

The following is the end result I want (the sparse matrix of movement).

array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
   [ 0.,  1.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
   [ 1.,  0.,  1.,  0.,  0.,  1.,  0.,  0.,  0.,  0.]])

which is a representative of this graph I have drawn up: graph of positions

Whereby the first column is represented by list a and the second column represented by list b.

The pink highligher indicates a movement towards index 0 (upwards). The yellow highlighter indicates a movement downwards. No highlighter means no change in position.

This is what I have:

>>>import numpy as np
>>>sparse = np.zeros((len(a),len(a)))
>>>sparse.shape
(10, 10)
>>>a_unique = np.unique(np.array(b), return_index=True)
>>>b_unique = np.unique(np.array(b), return_index=True)
>>>a_unique
(array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11]), array([3, 1, 6, 4, 0, 5, 8, 9, 2, 7]))
>>>b_unique
(array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11]), array([5, 1, 9, 3, 2, 6, 7, 8, 0, 4]))

Now if we subtract b_unique from a_unique, this gives us the following:

>>> a_unique[1]-b_unique[1]
array([-2,  0, -3,  1, -2, -1,  1,  1,  2,  3])

^ A negative number is represented vertically (as a column) in the sparse matrix as positions given to other digits (i.e. the number has moved downwards from list a to list b, i.e. yellow highlighter).

^ A positive number is represented horizontally (as a row) in the sparse matrix as positions received from other digits (i.e. the number has moved upwards from list a to list b, i.e. pink highlighter).

I am not sure how to continue to solve this problem and hence why I need assistance.

Solution

I was able to solve this problem after sleeping on it.

def seq_movement(a,b):
    import numpy as np
    def counter(items):
    counts = dict()
    for i in items:
        counts[i] = counts.get(i, 0) + 1
    return counts

    def find_2(dic):
        return [k for k, v in dic.items() if v ==2]

    sort = sorted(a)
    sparse = np.zeros([len(sort)]*2)

    while sorted(a) == sorted(b):
        for i,j in enumerate(sort):
                ai = a.index(j)
                bi = b.index(j)
                abovea = a[:ai]
                belowb = b[bi:]
                receiving = find_2(counter(belowb + abovea)) #row
                for row_ele in receiving:
                    sparse[i][sort.index(row_ele)] = 1
                belowa = a[ai:]
                aboveb = b[:bi]
                giving = find_2(counter(belowa + aboveb)) #column
                for col_ele in giving:
                    sparse[:,i][sort.index(col_ele)] = 1
         break     
    return sparse

It takes input of a & b like in the example. First we want to make sure that we have the same participants in lists a and b.

We iterate through the list of participants and find the participants below participant j in list a. Then we get the participants above participant j in list b.

We then join the lists together and find which participants occur twice in the amalgamated list. The participants occurring twice are the ones who have over taken participant j and hence j's column in the sparse matrix will have 1 in place to signify that.

This process is then done for those participants above participant j and subsequently where j is to receive a 1 from that participant (i.e. the row numeral).

Any other queries feel free to ask!

Mission Accomplished.