
Creating a .mat file of v7.3 in python


I need to perform a multiplication involving a 60000×70000 matrix in either Python or MATLAB. I have 16 GB of RAM and can easily load each row of the matrix (which is what I require). I am able to create the matrix as a whole in Python but not in MATLAB. Is there any way I can save the array as a v7.3 .mat file using h5py or scipy so that I can load each row separately?


Solution

  • For MATLAB v7.3 files you can use hdf5storage, which requires h5py. Download the package from https://pypi.python.org/pypi/hdf5storage, extract it, then run python setup.py install from a command prompt.

    import h5py
    import hdf5storage
    import numpy as np
    
    matfiledata = {}  # dictionary that holds the variables to write to the MAT file
    # The u prefix marks the variable name as unicode. It causes no issues through
    # Python 3.5, and based on feedback it is advisable to keep it despite the docs.
    matfiledata[u'variable1'] = np.zeros(100)
    matfiledata[u'variable2'] = np.ones(300)
    hdf5storage.write(matfiledata, '.', 'example.mat', matlab_compatible=True)
    

    If MATLAB can't load the whole thing at once, I think you'll have to save it as separate variables: matfiledata[u'chunk1'], matfiledata[u'chunk2'], matfiledata[u'chunk3'], etc.

    Then in MATLAB, with each chunk saved as its own variable, you can load and process them one at a time:

    load(filename,'chunk1')
    % do stuff...
    clear chunk1
    load(filename,'chunk2')
    % do stuff...
    clear chunk2
    

    etc.
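    On the Python side, the chunking idea above could be sketched roughly as follows. This is only an illustration: the helper names split_into_row_chunks and write_chunks, the rows_per_chunk value, and the small demo matrix are assumptions made for the example, not part of hdf5storage. The write call simply mirrors the one shown earlier.

```python
import numpy as np


def split_into_row_chunks(matrix, rows_per_chunk):
    """Return a dict mapping 'chunk1', 'chunk2', ... to consecutive row blocks.

    Hypothetical helper for illustration; the names are not from any library.
    """
    chunks = {}
    starts = range(0, matrix.shape[0], rows_per_chunk)
    for index, start in enumerate(starts, start=1):
        # Basic slicing returns a view, so no rows are copied at this point.
        chunks[u'chunk%d' % index] = matrix[start:start + rows_per_chunk]
    return chunks


def write_chunks(chunks, filename):
    """Write the chunk dictionary to a v7.3 MAT file (requires hdf5storage)."""
    import hdf5storage  # imported lazily so the splitting helper works without it
    # Same call as in the answer's example above.
    hdf5storage.write(chunks, '.', filename, matlab_compatible=True)


# Small stand-in for the real 60000 x 70000 matrix.
demo = np.arange(10 * 4, dtype=np.float64).reshape(10, 4)
demo_chunks = split_into_row_chunks(demo, rows_per_chunk=4)

# Uncomment to actually write the file once hdf5storage is installed:
# write_chunks(demo_chunks, 'example.mat')
```

    Each key then becomes a separate MATLAB variable, so load(filename,'chunk1') pulls in only that block of rows. The chunk size is a trade-off you would tune to what MATLAB can hold in memory at once.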

    hdf5storage.savemat has a parameter that allows the file to be read back into Python correctly later, so it is worth checking out; its interface follows scipy.io.savemat. Alternatively, if you are saving the data from MATLAB, you can do something like this to make it easy to import back into Python:

    MATLAB:
    save('example.mat','-v7.3')
    Python:
    import hdf5storage
    matdata = hdf5storage.loadmat('example.mat')
    

    That will load back into Python as a dictionary which you can then convert into whatever datatypes you need.