I would like to reproduce the SVD method mentioned in a Stanford lecture on my own dataset. The slide from the lecture is as follows:
My dataset is of the same type: a word co-occurrence matrix M of size
<13840x13840 sparse matrix of type '<type 'numpy.int64'>'
with 597828 stored elements in Compressed Sparse Column format>
It was generated and processed from CountVectorizer(). Note that it is a symmetric matrix.
However, when I try to extract features with SVD, none of the following code works:
scipy.linalg.svd(M)
I have tried converting the matrix from sparse CSR with todense() and toarray(); my computer runs for quite a few minutes and then reports that the kernel has stopped. I also played around with different parameter settings for
scipy.sparse.linalg.svds(M)
I have also tried changing the matrix type from int64 to float64, but the kernel dies after 30 seconds or so.
Could anyone suggest a way to perform SVD on this matrix?
Thank you so much
It seems that the matrix is too demanding for your memory. You have several options:
The latter two should work out of the box. All of these options load only as much data as fits in memory.
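To make the memory point concrete, here is a minimal sketch and not a definitive fix. It estimates what a dense copy of the matrix costs (about 1.4 GiB for 13840 x 13840 at 8 bytes per entry, and a full SVD also has to hold U and V of the same size), then computes only the top k singular triplets with scipy.sparse.linalg.svds on a float64 copy. The helper names dense_gib and truncated_svd, the choice k=100, and the small synthetic matrix M_demo standing in for your M are all assumptions for illustration. Whether a truncated SVD is enough depends on how many dimensions you need for the word features.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds


def dense_gib(n_rows, n_cols, itemsize=8):
    """Memory in GiB needed for one dense n_rows x n_cols array."""
    return n_rows * n_cols * itemsize / 2**30


def truncated_svd(M, k=100):
    """Top-k singular triplets of a sparse matrix (hypothetical helper).

    svds needs a floating-point matrix and 1 <= k < min(M.shape).
    """
    M = sp.csc_matrix(M).astype(np.float64)
    U, s, Vt = svds(M, k=k)
    # The order returned by svds is not guaranteed; sort largest first.
    order = np.argsort(s)[::-1]
    return U[:, order], s[order], Vt[order, :]


# One dense float64 copy of a 13840 x 13840 matrix.
print(f"dense copy: {dense_gib(13840, 13840):.2f} GiB")

# Synthetic stand-in for the co-occurrence matrix: sparse, symmetric, int64.
A = sp.random(500, 500, density=0.01, format="csc", random_state=0)
A.data = np.ceil(A.data * 10)
M_demo = (A + A.T).astype(np.int64)

U, s, Vt = truncated_svd(M_demo, k=10)

# One common choice of word features: rows of U scaled by singular values.
word_vectors = U * s
print(word_vectors.shape)
```

With the real matrix you would call truncated_svd(M, k=100) directly. The matrix stays sparse throughout, so the only dense arrays created are U and Vt with k columns and k rows respectively.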