python, scipy, mat

How to load .mat files in Python, and access columns individually?


I can load the .mat file, but because the columns have no names, I don't know how to reference them.

The .mat file consists of four columns, with a lot of rows.

import numpy as np 
import pandas as pd
from scipy.io import loadmat
from sklearn.preprocessing import PolynomialFeatures 


data = loadmat('data.mat')
data.keys()

This results in: data['no names for columns in mat file']

What is wrong with this code?


Solution

  • When I load a test .mat file, I get output like this:

    In [50]: data=loadmat('test7.mat')
    In [51]: print(data)
    {'__globals__': [], 'x': array([[ 1.,  2.,  3.],
           [ 4.,  5.,  6.]]), '__version__': '1.0', '__header__': b'MATLAB 5.0 MAT-file, written by Octave 4.0.0, 2016-09-01 15:43:02 UTC'}
    

    That tells me the .mat file contains a variable called x, which I can access with:

    In [52]: data['x']
    Out[52]: 
    array([[ 1.,  2.,  3.],
           [ 4.,  5.,  6.]])
    

    We need the same kind of information about your file in order to help. Listing the keys is a good start:

    In [53]: list(data.keys())   # list() is needed in Python 3
    Out[53]: ['__globals__', 'x', '__version__', '__header__']
    

    I'm not quite sure what you mean by columns and names in a MATLAB context. Are the items in the file expected to be MATLAB matrices, cells, and/or structs?

    Column names are a pandas concept, not a NumPy or MATLAB one (that I know of).
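
If the file turns out to hold a single plain matrix, the columns can be reached by position rather than by name. Here is a minimal sketch of that idea. It writes its own small stand-in file so that it runs anywhere; the variable name `measurements` and the 5x4 contents are assumptions made for illustration, not what the asker's data.mat actually contains.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Create a stand-in .mat file holding one 5x4 matrix.
# 'measurements' is a hypothetical variable name chosen for this example.
path = os.path.join(tempfile.mkdtemp(), "data.mat")
savemat(path, {"measurements": np.arange(20.0).reshape(5, 4)})

data = loadmat(path)

# Ignore the metadata entries (__header__, __version__, __globals__)
# so that only the real MATLAB variables remain.
names = [key for key in data if not key.startswith("__")]
print(names)          # ['measurements']

matrix = data[names[0]]
print(matrix.shape)   # (5, 4)

# Columns have no names in a NumPy array; select them by index.
first_col = matrix[:, 0]

# Or unpack all four columns at once by iterating over the transpose.
col_a, col_b, col_c, col_d = matrix.T
```

If named columns are wanted afterwards, the array could be passed to `pandas.DataFrame` with a `columns=` list of your own choosing. This only applies when the variable is a 2-D numeric matrix; cells and structs come back from `loadmat` as object or structured arrays and need different handling.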