Is there a way to convert from a pandas.SparseDataFrame to scipy.sparse.csr_matrix without generating a dense matrix in memory?
scipy.sparse.csr_matrix(df.values) doesn't work, as it generates a dense matrix which is then cast to the csr_matrix.
Thanks in advance!
The pandas docs talk about an experimental conversion to scipy sparse, SparseSeries.to_coo:
http://pandas-docs.github.io/pandas-docs-travis/sparse.html#interaction-with-scipy-sparse
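A minimal sketch of that series-level conversion, under some assumptions: the SparseSeries class was removed in pandas 1.0, so this uses the .sparse accessor that replaced it (pandas 0.25 or later). The data and the level names "row" and "col" are made up for illustration.

```python
import pandas as pd

# Made-up data: three stored values addressed by a two-level MultiIndex.
index = pd.MultiIndex.from_tuples(
    [(0, "x"), (1, "y"), (2, "x")], names=["row", "col"]
)
s = pd.Series([3.0, 1.0, 2.0], index=index)

# Convert to a sparse dtype; the accessor below takes over from SparseSeries.
ss = s.astype("Sparse")

# One index level becomes the matrix rows, the other the columns.
A, row_labels, col_labels = ss.sparse.to_coo(
    row_levels=["row"], column_levels=["col"], sort_labels=True
)

# A is a scipy COO matrix; switching to CSR stays in sparse form.
csr = A.tocsr()
```

The returned label lists record which index values each matrix row and column corresponds to.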
================
Edit: this is a special function for a series with a MultiIndex, not for a data frame. See the other answers for that. Note the difference in dates.
============
As of 0.20.0, there is a sdf.to_coo() and a multiindex ss.to_coo(). Since a sparse matrix is inherently 2D, it makes sense to require a MultiIndex for the (effectively) 1D data series, while the dataframe can already represent a table or 2D array.
When I first responded to this question (June 2015), this sparse dataframe/series feature was experimental.
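For the data frame case, here is a rough sketch of going to a csr_matrix without a dense intermediate. It assumes a newer pandas (0.25 or later), where SparseDataFrame has been replaced by ordinary DataFrames with sparse columns and sdf.to_coo() corresponds to df.sparse.to_coo(). The matrix contents and column names are made up.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Made-up 3x3 matrix with three nonzero entries, built directly in sparse form.
values = np.array([1, 2, 3], dtype=np.int64)
rows = np.array([0, 1, 2])
cols = np.array([1, 0, 2])
mat = sparse.csr_matrix((values, (rows, cols)), shape=(3, 3))

# A DataFrame whose columns are sparse arrays (fill value 0).
df = pd.DataFrame.sparse.from_spmatrix(mat, columns=["a", "b", "c"])

# Sparse DataFrame -> scipy COO -> CSR.
coo = df.sparse.to_coo()
csr = coo.tocsr()
```

Note that to_coo() on a data frame expects every column to be sparse with a fill value of 0.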