I recently struggled with stacking several sparse matrices into a single matrix. I used to create multiple csr_matrix objects like this:
vec_list = sp.sparse.csr_matrix(my_vec_i) # every vector of shape (1,200)
Once vec_list contained around 100 sparse matrices, I used scipy's (NOT numpy's) sp.vstack function to merge all 100 entries into a csr matrix of shape (100, 200).
Now, in my current setup (Python 3.8), I see a warning that sp.vstack is going to be deprecated. In any case, no matter whether I use numpy's or scipy's vstack, I end up with an array of shape (100, 1), where my 200 columns are treated as a single csr_matrix entry in the first and only column.
In my old code snippets I can see that sp.vstack(vec_list) created a sparse csr matrix of shape (100, 200).
Am I missing anything? Does anyone have thoughts on this? I am getting slightly desperate to create my stacked sparse matrix. Thanks, all.
Edit: As you can see in my comment below, np.vstack and sp.vstack do not necessarily do the same thing (in my answer I said np.vstack twice, but I meant sp.vstack once). I used the exact solution (copied), and it returned an error at some point because no stacking took place. In order to use sp.vstack, I stacked non-csr_matrix arrays and then converted the result to a csr_matrix. This is not practical for huge sets of arrays, but at least I could run through the file without issues. To address the answer below from Tinu: I was not able to solve it this way, as the result looks like the following when executing the example code:
>>> np.vstack(vec_list).shape
(100, 1)
>>> sp.vstack(vec_list).shape
(100, 200)
Python 3.8.2, Scipy 1.4.1
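The (100, 200) shape above can be checked with a sketch along these lines. It assumes that sp refers to scipy.sparse (the import is not shown in the post) and uses hypothetical random data in place of my_vec_i, so it is only an illustration, not the exact code that produced the output.

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in data: 100 sparse row vectors of shape (1, 200)
rng = np.random.default_rng(0)
vec_list = [
    sparse.csr_matrix(rng.integers(0, 2, size=(1, 200)))
    for _ in range(100)
]

# scipy.sparse.vstack understands sparse blocks and stacks them row-wise
stacked = sparse.vstack(vec_list, format="csr")
print(stacked.shape)  # (100, 200)
```

If sp is instead the top-level scipy package, then to my knowledge sp.vstack in SciPy 1.4 is only a deprecated alias for numpy's vstack, which would be consistent with the deprecation warning and the (100, 1) shape described in the question.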
Unfortunately, I was not able to reproduce the result stated above using my Python 3.8.3rc1. Copying the code and stacking led to the following:
>>> np.vstack(vec_list).shape # (100, 1)
>>> sp.vstack(vec_list).shape # (100, 1)
What I will do to circumvent my problem: I will stack non-csr_matrix arrays and then convert the result to a csr_matrix. Thanks anyway!
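A minimal sketch of that workaround, assuming the vectors are available as dense numpy arrays of length 200 (dense_rows and the random data are hypothetical placeholders for the real vectors):

```python
import numpy as np
from scipy import sparse

# Hypothetical stand-in data: 100 dense row vectors of length 200
rng = np.random.default_rng(0)
dense_rows = [rng.integers(0, 2, size=200) for _ in range(100)]

# Stack the dense rows first, then convert the result once
dense = np.vstack(dense_rows)        # dense array of shape (100, 200)
stacked = sparse.csr_matrix(dense)   # sparse matrix of the same shape
print(stacked.shape)
```

The intermediate dense array holds every value in memory, which is why this approach does not scale well to huge sets of arrays.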