python numpy matlab binaryfiles

Different result writing to binary file with python and matlab


I am attempting to convert a MATLAB script to Python. The script writes data to a binary file. I have noticed differences between the two output files, specifically when writing matrices/NumPy arrays. When writing other variable types (int, string, etc.), the files are identical, as desired.

Matlab code:

fid = fopen("test2.txt", "wb");
a = [[1 2];[3 4]];
fwrite(fid, a, "float64");
fclose(fid);

Matlab result (as shown by notepad):

ð? @ @ @


Python code:

import numpy as np
with open("test2.txt", "wb") as fid:
    a = np.array([[1, 2], [3, 4]])
    fid.write(a.astype("float64"))
    # a.astype("float64").tofile(fid) # also doesn't give correct result

Python result (as shown by notepad):

ð? @ @ @



Solution

  • The characters still look very similar because Notepad is trying to render raw float64 bytes as text, but I think it gave enough of a hint. For easier typing, let's call MATLAB's text d? [@ _@ [@ and Python's text d? _@ [@ [@.

    Computer memory is linear, so all multidimensional arrays are actually stored as 1D arrays. What you're seeing is NumPy arrays being C order (by default) versus Matlab matrices being Fortran order (by default). This order is how multidimensional arrays are flattened to 1D arrays in memory.

    matrix   notepad text
    1  2     d?  _@               
    3  4     [@  [@
    
    Matlab Fortran order goes by columns
     1  3  2  4
    d? [@ _@ [@
    
    NumPy C order goes by rows
     1  2  3  4
    d? _@ [@ [@
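
To see the two flattening orders directly, here is a minimal sketch using the same 2x2 matrix. It relies only on NumPy; ndarray.ravel() and ndarray.tobytes() both accept an order argument, and the variable names are just for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]], dtype="float64")

# C order (NumPy's default): walk along each row in turn.
c_flat = a.ravel(order="C")
print(c_flat)  # values in the order 1, 2, 3, 4

# Fortran order (MATLAB's layout): walk down each column in turn.
f_flat = a.ravel(order="F")
print(f_flat)  # values in the order 1, 3, 2, 4

# The raw bytes differ in exactly the same way, which is what
# Notepad was showing for the two files.
c_bytes = a.tobytes(order="C")
f_bytes = a.tobytes(order="F")
print(c_bytes == f_bytes)  # False: same numbers, different sequence
```

Both byte strings hold the same four float64 values; only the order in which they appear changes.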
    

    Since you're converting code between MATLAB and Python, you should be very aware that the array orders differ. Iteration is faster when you don't jump around in memory, so nested for-loops may have to be reordered. It won't make much of a difference for vectorized code such as someScalar * myArray, because the traversal is handled for you. NumPy does provide functions and optional arguments to create Fortran-order arrays (numpy.asfortranarray(), ndarray.copy(order='F')) and to check the order (ndarray.flags.f_contiguous, ndarray.flags.c_contiguous), but coding that way is still tougher because C order is the default.
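
If the goal is a file that matches what MATLAB's fwrite produced, one option is to serialize the array in Fortran order yourself. The sketch below is one way to do that, not the only one: write_column_major is a hypothetical helper name, and the temporary path stands in for test2.txt. It also assumes both programs run on machines with the same byte order.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2], [3, 4]], dtype="float64")

# Layout checks: a freshly created NumPy array is C-contiguous.
print(a.flags.c_contiguous, a.flags.f_contiguous)

# Same values, but stored column-major in memory.
a_f = np.asfortranarray(a)
print(a_f.flags.c_contiguous, a_f.flags.f_contiguous)


def write_column_major(path, array, dtype="float64"):
    """Hypothetical helper: write `array` column by column, MATLAB-style."""
    data = np.asarray(array, dtype=dtype)
    with open(path, "wb") as fid:
        # tobytes(order="F") serializes in column-major order regardless
        # of how `data` happens to be laid out in memory.
        fid.write(data.tobytes(order="F"))


# Temporary location used here instead of test2.txt (an assumption).
path = os.path.join(tempfile.mkdtemp(), "test2.bin")
write_column_major(path, a)

# Read the raw values back: they come out in the order 1, 3, 2, 4.
on_disk = np.fromfile(path, dtype="float64")
print(on_disk)
```

For 2-D arrays, a.T.astype("float64").tofile(fid) should give the same bytes, since tofile() always writes in C order and the transpose swaps rows and columns. For the same reason, np.asfortranarray(a).tofile(fid) does not change the file contents.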