python pandas dataframe sparse-matrix

AttributeError: 'DataFrame' object has no attribute 'to_sparse'


sdf = df.to_sparse() no longer works: it was deprecated in pandas 0.25 and removed in 1.0.0, so it now raises the AttributeError above. What's the updated way to convert to a sparse DataFrame?


Solution

  • These are the updated sparse conversions in pandas 1.0.0+.


    How to convert dense to sparse

    Use DataFrame.astype() with the appropriate SparseDtype() (e.g., int):

    >>> import pandas as pd
    >>> df = pd.DataFrame({'A': [1, 0, 0, 0, 1, 0]})
    >>> df.dtypes
    # A    int64
    # dtype: object
    
    >>> sdf = df.astype(pd.SparseDtype(int, fill_value=0))
    >>> sdf.dtypes
    # A    Sparse[int64, 0]
    # dtype: object
    

    Or use the string alias for brevity:

    >>> sdf = df.astype('Sparse[int64, 0]')
    
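    The fill value does not have to be 0. For float data where missing values dominate, NaN can serve as the fill value instead, so only the non-missing entries are stored. A minimal sketch, assuming made-up data (the column name and values are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical float column that is mostly missing.
df = pd.DataFrame({'A': [1.0, np.nan, np.nan, 2.5]})

# Treat NaN (not 0) as the value that is left unstored.
sdf = df.astype(pd.SparseDtype('float64', fill_value=np.nan))
print(sdf.dtypes)

# Number of explicitly stored (non-fill) values in column A.
n_stored = sdf['A'].sparse.npoints
print(n_stored)
```

    Picking the fill value that matches the most common entry is what makes the sparse representation pay off; a poorly chosen one stores nearly every value.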

    How to convert sparse to dense

    Use DataFrame.sparse.to_dense():

    >>> from scipy import sparse
    >>> sdf = pd.DataFrame.sparse.from_spmatrix(sparse.eye(3), columns=list('ABC'))
    >>> sdf.dtypes
    # A    Sparse[float64, 0]
    # B    Sparse[float64, 0]
    # C    Sparse[float64, 0]
    # dtype: object
    
    >>> df = sdf.sparse.to_dense()
    >>> df.dtypes
    # A    float64
    # B    float64
    # C    float64
    # dtype: object
    
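    To check that a dense-to-sparse conversion did what was intended, one option is to inspect the density and confirm the round trip is lossless. A small sketch, assuming illustrative data and an explicit 'int64' subtype so the result does not depend on the platform's default integer width:

```python
import pandas as pd

# Hypothetical, mostly-zero data.
df = pd.DataFrame({'A': [1, 0, 0, 0, 1, 0], 'B': [0, 0, 0, 2, 0, 0]})

sdf = df.astype(pd.SparseDtype('int64', fill_value=0))

# Average fraction of values per column that differ from fill_value
# (3 of the 12 values here are non-zero).
density = sdf.sparse.density
print(density)

# Converting back to dense should reproduce the original frame.
roundtrip = sdf.sparse.to_dense()
print(roundtrip.equals(df))
```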

    How to convert sparse to COO

    Use DataFrame.sparse.to_coo(), which returns a SciPy COO matrix rather than a DataFrame:

    >>> from scipy import sparse
    >>> sdf = pd.DataFrame.sparse.from_spmatrix(sparse.eye(3), columns=list('ABC'))
    >>> sdf.dtypes
    # A    Sparse[float64, 0]
    # B    Sparse[float64, 0]
    # C    Sparse[float64, 0]
    # dtype: object
    
    >>> coo = sdf.sparse.to_coo()
    >>> coo  # exact repr varies by SciPy version
    # <3x3 sparse matrix of type '<class 'numpy.float64'>'
    #         with 3 stored elements in COOrdinate format>
    
    >>> print(coo)
    # (0, 0)    1.0
    # (1, 1)    1.0
    # (2, 2)    1.0
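
    COO is mainly a construction and interchange format, so a common next step is converting to CSR for arithmetic or row slicing, and then wrapping the result back into a sparse DataFrame if needed. A rough sketch of that round trip, reusing the identity-matrix example above (the variable names are illustrative):

```python
import pandas as pd
from scipy import sparse

sdf = pd.DataFrame.sparse.from_spmatrix(sparse.eye(3), columns=list('ABC'))

# Sparse DataFrame -> SciPy COO matrix.
coo = sdf.sparse.to_coo()

# COO -> CSR, which supports efficient row slicing and arithmetic.
csr = coo.tocsr()

# SciPy matrix -> sparse DataFrame again, restoring the labels.
sdf2 = pd.DataFrame.sparse.from_spmatrix(
    csr, index=sdf.index, columns=sdf.columns
)
print(sdf2.dtypes)
```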