
Different outputs when reading binary file in Matlab and Julia


I am trying to read a binary file in Julia. The file contains positions in 3D space. I am basing my code on a MATLAB script, but when I try to reproduce the read, Julia gives me a different result. I've read through Julia's documentation but cannot work out where the difference comes from.

Here's the MATLAB code:

file = 'lh.white';
fid = fopen(file, 'rb', 'b');
b1 = fread(fid, 1, 'uchar');
b2 = fread(fid, 1, 'uchar');
b3 = fread(fid, 1, 'uchar');   
fgets(fid);
fgets(fid);
vnum = fread(fid, 1, 'int32');

and the Julia analog:

file = "lh.white"
fid = open(file)
signature = read(fid, 3)  # equivalent of 3 'uchar' from matlab
readline(fid, keep=true) # ok
readline(fid, keep=true) # ok
vnum = read(fid, Int32)  # different results.

Would anyone know what could be happening here, and why vnum in Julia is a large negative number that differs from the MATLAB output?

I've uploaded a dummy dataset here, if that turns out to be useful: https://filebin.net/k8sv7d8ztmucqell


Solution

  • The issue is most likely the endianness of the data you are processing. In MATLAB, opening the file so that fread() uses big-endian byte order (the third argument, 'b', to fopen()), i.e.

    fid = fopen(file, 'rb', 'b');
    %fread()s, fgets()es
    vnum = fread(fid, 1, 'int32');
    

    produces vnum = 179589, whereas using the native machine format (little-endian mode, 32-bit words),

    fid = fopen(file, 'rb');
    %fread()s, fgets()es
    vnum = fread(fid, 1, 'int32');
    

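    produces vnum = -2051210752. I have not run the Julia code you provided, but it calls read() without any byte-order conversion (see https://docs.julialang.org/en/v1/base/io-network/#Base.read), so on a little-endian machine I would expect it to return -2051210752 as well. Wrapping the call as ntoh(read(fid, Int32)) should convert the big-endian value to host byte order and give 179589.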

    (The integer we are attempting to read here is 0x0002bd85; see the hexdump of the lh.white you provided:

    $ xxd lh.white
    00000000: ffff fe63 7265 6174 6564 2062 7920 7465  ...created by te
    00000010: 7374 0a0a 0002 bd85 0005 7b06 c189 b752  st........{....R
    

    Online endianness calculators can show the result of reading this 4-byte sequence in either byte order.)
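To check the byte-order explanation without the original file, here is a minimal Julia sketch. It assumes the header looks exactly like the hexdump above (a 3-byte signature, two newline-terminated lines, then a big-endian int32) and uses an in-memory IOBuffer as a stand-in for open("lh.white"). The variable names are only illustrative.

```julia
# Bytes copied from the hexdump above (assumed layout):
# 3-byte signature, "created by test\n", an empty line "\n", then the int32.
bytes = vcat(UInt8[0xff, 0xff, 0xfe],
             Vector{UInt8}("created by test\n\n"),
             UInt8[0x00, 0x02, 0xbd, 0x85])

io = IOBuffer(bytes)             # stand-in for open("lh.white")
signature = read(io, 3)          # three unsigned bytes, like 3x fread(fid, 1, 'uchar')
readline(io, keep=true)          # first text line, like fgets(fid)
readline(io, keep=true)          # second (empty) text line

raw  = read(io, Int32)           # interpreted in the host's byte order
vnum = ntoh(raw)                 # big-endian ("network") order -> host order

println("raw  = ", raw)          # -2051210752 on a little-endian host
println("vnum = ", vnum)         # 179589
```

With the real file, the same idea would be to replace the IOBuffer with open(file) and keep the ntoh() call around read(fid, Int32). Any later multi-byte fields in the file would presumably need the same conversion.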