Tags: python, numpy, scipy, sparse-matrix

Send sparse matrix numpy through API request efficiently


I need to send some huge matrices (full of 0s) in several server-to-server communications over HTTP, using JSON.

I'm working with numpy and scipy in Python 3.x.

Is there any standard way or tool to do it?

I guess I could send the indices of the nonzero entries and rebuild the full matrices on the second server, but I would like to avoid reinventing the wheel with custom code.

Thank you in advance.


Solution

  • The easiest approach is just pickling, but the dedicated functions are probably more efficient!

    Here is a demo using Python 3 and SciPy's dedicated save_npz function (which compresses by default), wrapped with BytesIO so that everything happens in memory instead of in files.

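    I'm not touching the JSON part. Note that save_npz produces raw bytes rather than text, so they need a text encoding such as base64 before they can be embedded in JSON; that step should be straightforward, especially for people doing web stuff.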

    Code:

    import io
    import scipy.sparse as sp
    
    mat = sp.random(100, 100, density=0.001)
    print(mat)
    
    # matrix -> serialized bytes (an in-memory .npz archive)
    tmp = io.BytesIO()
    sp.save_npz(tmp, mat)
    payload = tmp.getvalue()  # full buffer contents as a bytes object
    print(payload)
    
    # serialized bytes -> matrix
    tmp_ = io.BytesIO(payload)
    mat_loaded = sp.load_npz(tmp_)
    print(mat_loaded)
    

    Output:

    (59, 11)      0.137877385333
    (7, 36)       0.137729960685
    (94, 14)      0.0951372931412
    (3, 80)       0.235640993271
    (56, 54)      0.504472012678
    (8, 14)       0.657124520803
    (22, 92)      0.951629612278
    (81, 18)      0.733232743418
    (39, 16)      0.228000113182
    (17, 15)      0.127198226805
    b'PK\x03\x04\x14\x00\x00\x00\x08\x00\xd5}gK\xc9\xb8\xd0xH\x00\x00\x00\\\x00\x00\x00\n\x00\x00\x00format.npy\x9b\xec\x17\xea\x1b\x10\xc9\xc8\xe0\xc6P\xad\x9e\x92Z\x9c\\\xa4n\xa5\xa0n\x13j\xac\xae\xa3\xa0\x9e\x96_TR\x94\x98\x17\x9f_\x94\x92\n\x12wK\xcc)N\x05\x8a\x17g$\x16\xa4\x02\xf9\x1a\x9a:\n\xb5\n(\x80+\x99\x81\x81!\x1f\x8a\x01PK\x03\x04\x14\x00\x00\x00\x08\x00\xd5}gKR\xab(\x82I\x00\x00\x00X\x00\x00\x00\t\x00\x00\x00shape.npy\x9b\xec\x17\xea\x1b\x10\xc9\xc8\xe0\xc6P\xad\x9e\x92Z\x9c\\\xa4n\xa5\xa0n\x93i\xa2\xae\xa3\xa0\x9e\x96_TR\x94\x98\x17\x9f_\x94\x92\n\x12wK\xcc)N\x05\x8a\x17g$\x16\xa4\x02\xf9\x1aF:\x9a:\n\xb5\nH\x80+\x85\x81\x81\x01\x84\x01PK\x03\x04\x14\x00\x00\x00\x08\x00\xd5}gKy\xea\xf44\x99\x00\x00\x00\xa0\x00\x00\x00\x08\x00\x00\x00data.npy\x9b\xec\x17\xea\x1b\x10\xc9\xc8\xe0\xc6P\xad\x9e\x92Z\x9c\\\xa4n\xa5\xa0n\x93f\xa1\xae\xa3\xa0\x9e\x96_TR\x94\x98\x17\x9f_\x94\x92\n\x12wK\xcc)N\x05\x8a\x17g$\x16\xa4\x02\xf9\x1a\x86\x06:\x9a:\n\xb5\n\x08\xc0%\xb3,/\xec\xfb\xd2\x83\xf6%/2\x96)-<h\xefp\x7f\xf5\xabWQ;\xec\x0f\x88\xdc|]\xady\xce\xbe)\xb2\xadv\x91\xca\x03\xfb\x97\x8f\x8f3h\xb1?\xb5?\xdat\xe5\xe3\xfe\xe2w\xf6\xe5\x87W5/){n/\xc1|v\x92\xb4\xfeY{\tY9\x01\x0e\x8f\x03\xf6\x00PK\x03\x04\x14\x00\x00\x00\x08\x00\xd5}gK\x96\xb0\xb4\xa3]\x00\x00\x00x\x00\x00\x00\x07\x00\x00\x00col.npy\x9b\xec\x17\xea\x1b\x10\xc9\xc8\xe0\xc6P\xad\x9e\x92Z\x9c\\\xa4n\xa5\xa0n\x93i\xa2\xae\xa3\xa0\x9e\x96_TR\x94\x98\x17\x9f_\x94\x92\n\x12wK\xcc)N\x05\x8a\x17g$\x16\xa4\x02\xf9\x1a\x86\x06:\x9a:\n\xb5\n\x08\xc0\xc5\xcd\xc0\xc0\xa0\x02\xc4|@\x1c\x00\xc4fPv\x0c\x10\x0b\x01\xb1\x00\x10\xf3\x031\x00PK\x03\x04\x14\x00\x00\x00\x08\x00\xd5}gK\r\xef\xd0@_\x00\x00\x00x\x00\x00\x00\x07\x00\x00\x00row.npy\x9b\xec\x17\xea\x1b\x10\xc9\xc8\xe0\xc6P\xad\x9e\x92Z\x9c\\\xa4n\xa5\xa0n\x93i\xa2\xae\xa3\xa0\x9e\x96_TR\x94\x98\x17\x9f_\x94\x92\n\x12wK\xcc)N\x05\x8a\x17g$\x16\xa4\x02\xf9\x1a\x86\x06:\x9a:\n\xb5\n\x08\xc0e\xcd\xc0\xc0\xc0\x0e\xc4q@\xcc\x0c\xc4\x16@\xcc\x01\xc4b@\x1c\x08\xc4\xea@,\x08\xc4\x00PK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xd5}gK\xc9\xb8\xd0xH\x00\x00\x00\\\x00\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb6\x81\x00\x00\x00\x00format.npyPK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xd5}gKR\xab(\x82I\x00\x00\x00X\x00\x00\x00\t\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb6\x81p\x00\x00\x00shape.npyPK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xd5}gKy\xea\xf44\x99\x00\x00\x00\xa0\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb6\x81\xe0\x00\x00\x00data.npyPK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xd5}gK\x96\xb0\xb4\xa3]\x00\x00\x00x\x00\x00\x00\x07\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb6\x81\x9f\x01\x00\x00col.npyPK\x01\x02\x14\x00\x14\x00\x00\x00\x08\x00\xd5}gK\r\xef\xd0@_\x00\x00\x00x\x00\x00\x00\x07\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xb6\x81!\x02\x00\x00row.npyPK\x05\x06\x00\x00\x00\x00\x05\x00\x05\x00\x0f\x01\x00\x00\xa5\x02\x00\x00\x00\x00'
    (59, 11)      0.137877385333
    (7, 36)       0.137729960685
    (94, 14)      0.0951372931412
    (3, 80)       0.235640993271
    (56, 54)      0.504472012678
    (8, 14)       0.657124520803
    (22, 92)      0.951629612278
    (81, 18)      0.733232743418
    (39, 16)      0.228000113182
    (17, 15)      0.127198226805
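
For the JSON part that the demo leaves out, one possible sketch follows. It assumes base64 as the text encoding for the compressed bytes; the helper names `sparse_to_json` and `sparse_from_json` and the JSON key `matrix_npz_b64` are made up for illustration and are not part of SciPy.

```python
import base64
import io
import json

import scipy.sparse as sp


def sparse_to_json(mat):
    """Serialize a SciPy sparse matrix to a JSON string (hypothetical helper)."""
    buf = io.BytesIO()
    sp.save_npz(buf, mat)  # compressed .npz archive, kept in memory
    # JSON cannot carry raw bytes, so encode them as ASCII text with base64.
    encoded = base64.b64encode(buf.getvalue()).decode("ascii")
    return json.dumps({"matrix_npz_b64": encoded})  # key name is an assumption


def sparse_from_json(text):
    """Rebuild the sparse matrix from a string made by sparse_to_json."""
    encoded = json.loads(text)["matrix_npz_b64"]
    buf = io.BytesIO(base64.b64decode(encoded))
    return sp.load_npz(buf)


mat = sp.random(100, 100, density=0.001, format="csr")
body = sparse_to_json(mat)         # string to use as the HTTP request body
restored = sparse_from_json(body)  # what the receiving server would do

# Element-wise comparison: no differing entries means the round trip is lossless.
print((mat != restored).nnz == 0)
```

base64 makes the payload roughly a third larger than the raw bytes, so if both servers can exchange binary bodies, sending the .npz bytes directly and skipping JSON may be the cheaper option.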