python matlab file-io numpy mat-file

Why does saving/loading data in Python take a lot more space/time than in MATLAB?


I have some variables, including dictionaries, lists of lists, and NumPy arrays. I save all of them with the code below, where obj = [var1, var2, ..., varn]. The variables are small enough to fit in memory.

My problem is that when I save the same variables in MATLAB, the output file takes much less disk space than the one written by Python. Similarly, loading the variables from disk into memory takes much longer in Python than in MATLAB.

import pickle

with open(filename, 'wb') as output:
    pickle.dump(obj, output, pickle.HIGHEST_PROTOCOL)

Thanks


Solution

  • pickle does not compress its output, so try gzipping the pickle stream:

    To save to disk:

    import gzip
    import pickle

    # Compress the pickle stream as it is written; the file is closed automatically
    with gzip.open(filename + '.gz', 'wb') as gz:
        pickle.dump(obj, gz, pickle.HIGHEST_PROTOCOL)
    

    To load from disk:

    import gzip
    import pickle

    # Decompress and unpickle in one pass; the file is closed automatically
    with gzip.open(filename + '.gz', 'rb') as gz:
        obj = pickle.load(gz)
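
To check whether gzip actually helps for your data, a round trip with a size comparison might look like the sketch below. The helper names (save_compressed, load_compressed), the file names, and the sample obj are made up for illustration; NumPy arrays can be put in the same list and are pickled the same way. How much space you save depends on how repetitive the data is, and compression adds CPU time, so it is worth timing both variants on your real variables.

```python
import gzip
import os
import pickle
import tempfile


def save_compressed(obj, path):
    """Pickle obj and gzip-compress the stream while writing it."""
    with gzip.open(path, 'wb') as gz:
        pickle.dump(obj, gz, pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Read a gzip-compressed pickle file and return the stored object."""
    with gzip.open(path, 'rb') as gz:
        return pickle.load(gz)


# Hypothetical sample data: a dictionary and a list of lists
obj = [
    {'name': 'run1', 'values': list(range(1000))},
    [[0.0] * 100 for _ in range(100)],
]

tmpdir = tempfile.mkdtemp()
raw_path = os.path.join(tmpdir, 'data.pkl')
gz_path = os.path.join(tmpdir, 'data.pkl.gz')

# Plain pickle, as in the question
with open(raw_path, 'wb') as f:
    pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)

# Gzipped pickle, as in the answer
save_compressed(obj, gz_path)

restored = load_compressed(gz_path)
raw_size = os.path.getsize(raw_path)
gz_size = os.path.getsize(gz_path)
print('plain pickle bytes:', raw_size)
print('gzipped pickle bytes:', gz_size)
```

Passing the gzip file object straight to pickle.dump and pickle.load avoids building the whole pickled byte string in memory first, which matters more as the data grows.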