Shape inconsistence in my code based on numpy/scipy package

I have 2 matrices and a vector which I multiply by using the dot() function of numpy.

print D.shape, A.shape, y.shape, type(D), type(A), type(y)
# (236, 236) (236, 236) (236,)
# <class 'scipy.sparse.csr.csr_matrix'>
# <class 'scipy.sparse.csr.csr_matrix'>
# <type 'numpy.ndarray'>

y_next = np.dot(D, np.dot(A, y))

print y_next.shape
# (236,)

So if the shape of y_next is (236,) it means it is a 236x1-matrix, correct? Now if I do print y_next I get the output below. I just copied the last cupple of rows, but you can see, that the first index (row) is not unique. Where does this come from? I mean a matrix multiplyed by a vector should result in a vector, and a vector can't have the same indices twice by definition. If it has, as the ouput suggests, it would be a matrix.

Where is my mistake?

Output:

::  
(230, 212)  0.04
  (230, 205)    0.04
  (230, 187)    0.04
  (230, 11) 0.04
  (231, 230)    0.04
  (231, 212)    0.04
  (231, 205)    0.04
  (231, 187)    0.04
  (231, 11) 0.04
  (232, 235)    0.0625
  (232, 234)    0.0625
  (232, 233)    0.0625
  (232, 160)    0.0625
  (233, 235)    0.0625
  (233, 234)    0.0625
  (233, 232)    0.0625
  (233, 160)    0.0625
  (234, 235)    0.0625
  (234, 233)    0.0625
  (234, 232)    0.0625
  (234, 160)    0.0625
  (235, 234)    0.0625
  (235, 233)    0.0625
  (235, 232)    0.0625
  (235, 160)    0.0625
   (0, 79)  0.0555555555556
  (0, 3)    0.0555555555556
  (0, 2)    0.0555555555556
  (0, 1)    0.0833333333333
  (1, 80)   0.0555555555556
  (1, 3)    0.0555555555556
  (1, 2)    0.0555555555556
  (1, 0)    0.0833333333333
  (2, 81)   0.00966183574879
  (2, 8)    0.00966183574879
  (2, 7)    0.00966183574879
  (2, 6)    0.00966183574879
  (2, 5)    0.00966183574879
  (2, 4)    0.00966183574879
  (2, 3)    0.0338164251208
  (2, 1)    0.00966183574879
  (2, 0)    0.00966183574879
  (3, 82)   0.00966183574879
  (3, 8)    0.00966183574879
  (3, 7)    0.00966183574879
  (3, 6)    0.00966183574879
  (3, 5)    0.00966183574879
  (3, 4)    0.00966183574879
  (3, 2)    0.0338164251208
  (3, 1)    0.00966183574879
  : :
  (230, 212)    0.04
  (230, 205)    0.04
  (230, 187)    0.04
  (230, 11) 0.04
  (231, 230)    0.04
  (231, 212)    0.04
  (231, 205)    0.04
  (231, 187)    0.04
  (231, 11) 0.04
  (232, 235)    0.0625
  (232, 234)    0.0625
  (232, 233)    0.0625
  (232, 160)    0.0625
  (233, 235)    0.0625
  (233, 234)    0.0625
  (233, 232)    0.0625
  (233, 160)    0.0625
  (234, 235)    0.0625
  (234, 233)    0.0625
  (234, 232)    0.0625
  (234, 160)    0.0625
  (235, 234)    0.0625
  (235, 233)    0.0625
  (235, 232)    0.0625
  (235, 160)    0.0625]

Solution

The source of your confusion is the use of the numpy dot operator with scipy sparse matrices. For numpy and scipy matrices (note matrices, not arrays), the * operator computes dot products, like this:

In [47]: import scipy.sparse as sp

In [48]: import numpy as np

In [49]: D=sp.csr.csr_matrix(np.diagflat(np.random.random(100)))

In [50]: A=sp.csr.csr_matrix(np.diagflat(np.random.random(100)))

In [51]: y=np.random.random(100)

In [52]: y_next = A*(D*y)

In [53]: print y_next.shape, type(y_next)
(100,) <type 'numpy.ndarray'>

In [54]: print y_next
[ 0.00478446  0.0234117   0.02234696  0.23123913  0.15545059  0.366065
  0.05674736  0.00238582  0.08701694  0.00099934  0.01687756  0.08190578
  0.17570485  0.08015175  0.00301985  0.00491663  0.09450794  0.1141585
  0.02753342  0.0462671   0.02075956  0.21261696  0.82611774  0.09058998
  0.33545702  0.31456356  0.00260624  0.0449429   0.2431993   0.06302444
  0.01901411  0.02553964  0.02442291  0.02169692  0.15085474  0.41331208
  0.09486585  0.01001604  0.48898697  0.03557272  0.22931588  0.0760863
  0.37686888  0.02801424  0.3280943   0.1695001   0.02890001  0.11712331
  0.02996858  0.43608624  0.00905409  0.00655408  0.01618681  0.1417559
  0.0057121   0.0010656   0.02067559  0.05223334  0.14035328  0.0457123
  0.1273495   0.17688214  0.39300249  0.00625762  0.05356745  0.26719959
  0.08349373  0.05969248  0.02332782  0.0218782   0.1716797   0.04823102
  0.03117486  0.00172426  0.08514879  0.09505655  0.17030885  0.00953221
  0.00134071  0.03951708  0.00243708  0.04247436  0.32152315  0.02039932
  0.00436897  0.00097858  0.08876351  0.00824626  0.12004067  0.01060241
  0.11929884  0.01207807  0.10467955  0.02536641  0.602902    0.04115373
  0.00472405  0.05108167  0.28946041  0.19071962]

Alternatively, the sparse matrix classes own dot method will also work:

In [55]: print D.dot(A.dot(y)) - (D*(A*y))

[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.
  0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

Using numpy.dot seems to result in what is internally a sparse matrix with an ndarray type. The repeated entries you are seeing are the individual products which get summed down into final dot products, I think.