Tags: python, matrix, sparse-matrix

How to create huge sparse matrix with dtype=float16?


I've tried all of these and got either a MemoryError or some other kind of error.

Matrix1 = csc_matrix((130000,130000)).todense()

Matrix1 = csc_matrix((130000,130000), dtype=float_).todense()

Matrix1 = csc_matrix((130000,130000), dtype=float16).todense()

How can I create a huge sparse matrix with float type of data?


Solution

  • To create a huge sparse matrix, just do exactly what you're doing:

    Matrix1 = csc_matrix((130000,130000), dtype=float16)
    

    … without calling todense() at the end. This succeeds, and takes a tiny amount of memory (see footnote 1).

    When you add todense(), that successfully creates a huge sparse array that takes a tiny amount of memory, and then tries to convert that to a dense array that takes a huge amount of memory, which fails with a MemoryError. But the solution to that is just… don't do that.

    And likewise, if you use dtype=float_ instead of dtype=float16, you get float64 values (which aren't what you want, and take 4x the memory), but again, the solution is just… don't do that.
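The arithmetic behind that MemoryError can be checked without allocating anything. This is a minimal sketch, assuming NumPy is installed; `dense_nbytes` is a hypothetical helper written for this illustration, not part of NumPy or SciPy.

```python
import numpy as np

def dense_nbytes(shape, dtype):
    """Bytes a dense array of this shape and dtype would need (nothing is allocated)."""
    rows, cols = shape
    return rows * cols * np.dtype(dtype).itemsize

shape = (130000, 130000)
for dtype in (np.float16, np.float32, np.float64):
    gib = dense_nbytes(shape, dtype) / 2**30
    print(f"{np.dtype(dtype).name}: {gib:.1f} GiB")
```

For this shape the dense result comes to roughly 31 GiB for float16 and roughly 126 GiB for float64, which is more than most machines can allocate in one block.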


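    1. sys.getsizeof(m) gives 56 bytes for the sparse array handle, sys.getsizeof(m.data) gives 96 bytes for the internal storage handle, and m.data.nbytes gives 0 bytes for the actual values, for a grand total of 152 bytes. The index arrays (m.indices and m.indptr) add a little more, since indptr holds one entry per column plus one, but with 32-bit indices that is still well under a megabyte. None of which is likely to raise a MemoryError.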
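To check the sparse footprint yourself, something like the following should work. It is a sketch under a few assumptions: `sparse_nbytes` is a hypothetical helper, not a SciPy function, and float32 is used instead of float16 because I am not certain every SciPy version accepts float16 for sparse matrices. Exact byte counts depend on the platform and library versions, so none are shown.

```python
import numpy as np
from scipy.sparse import csc_matrix

def sparse_nbytes(m):
    """Total bytes held by the three arrays backing a CSC (or CSR) matrix."""
    return m.data.nbytes + m.indices.nbytes + m.indptr.nbytes

# Empty 130000 x 130000 matrix; note there is no .todense() call.
# float32 is an assumption made for portability; try float16 if your SciPy accepts it.
m = csc_matrix((130000, 130000), dtype=np.float32)

print("stored values:", m.nnz)
print("value bytes:  ", m.data.nbytes)
print("total bytes:  ", sparse_nbytes(m))
```

Storage grows with the number of nonzero entries rather than with the shape, so the matrix stays cheap as long as it stays sparse.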