python struct cython binaryfiles mmap

Faster way to write binary file with Python/Cython


I compared two ways of reading a binary file using Python/Cython:

The first used mmap and struct.unpack:

import mmap
import os
import struct

# filePath is the path to the binary file being read
fd = os.open(filePath, os.O_RDONLY)
mmap_file = mmap.mmap(fd, length=24, access=mmap.ACCESS_READ, offset=0)
# Header layout: Xmin (int), Ymax (int), Zmax (float), tileX (int), tileY (int)
Xmin = struct.unpack("i", mmap_file[:4])[0]
Xmax = Xmin + struct.unpack("i", mmap_file[12:16])[0]
Ymax = struct.unpack("i", mmap_file[4:8])[0]
Ymin = Ymax - struct.unpack("i", mmap_file[16:20])[0]
Zmax = struct.unpack("f", mmap_file[8:12])[0]

The second used mmap and ctypes from_buffer:

import mmap
import os
from ctypes import Structure, c_int, c_float

class StructHeaderLID(Structure):
    _fields_ = [('Xmin', c_int), ('Ymax', c_int), ('Zmax', c_float), ('tileX', c_int), ('tileY', c_int)]

# Array type holding a single header record
d_array = StructHeaderLID * 1

# from_buffer needs a writable buffer, hence O_RDWR and ACCESS_WRITE
fd = os.open(filePath, os.O_RDWR)
mmap_file = mmap.mmap(fd, length=24, access=mmap.ACCESS_WRITE, offset=0)
data = d_array.from_buffer(mmap_file)
for i in data:
    Xmin = i.Xmin
    Xmax = Xmin + i.tileX
    Ymax = i.Ymax
    Ymin = Ymax - i.tileY
    Zmax = i.Zmax

I found that the second one was faster.

What I want to solve now is how to write a new binary file as fast as possible. I know how to write it with struct.pack:

# Pack both ints in one call; the with block closes the file
with open(filePath, 'wb') as f:
    f.write(struct.pack("ii", 500000, 4000000))

but I would like to know whether there is a faster way (something similar to mmap + from_buffer, but for writing).

Thank you.

Pablo.


Solution

  • One of the fastest options is NumPy: create an array and write it directly to the file (for example with ndarray.tofile()), or map the file into memory with numpy.memmap().
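
A minimal sketch of both NumPy routes mentioned above. It makes a few assumptions: the record layout mirrors the 20-byte header from the question, the fields are little-endian, and the helper names (write_with_tofile, write_with_memmap), file names, and sample values are all made up for illustration. It has not been benchmarked against struct.pack, so measure it on your own data.

```python
import os
import tempfile

import numpy as np

# Assumed layout: same five fields as StructHeaderLID in the question,
# stored little-endian with no padding (5 x 4 bytes = 20 bytes per record).
header_dtype = np.dtype([
    ("Xmin", "<i4"),
    ("Ymax", "<i4"),
    ("Zmax", "<f4"),
    ("tileX", "<i4"),
    ("tileY", "<i4"),
])


def write_with_tofile(path, records):
    """Dump the raw bytes of a structured array to a new file."""
    records.tofile(path)


def write_with_memmap(path, records):
    """Create a file-backed array, copy the records in, and flush to disk."""
    mm = np.memmap(path, dtype=records.dtype, mode="w+", shape=records.shape)
    mm[:] = records
    mm.flush()
    del mm  # drop the reference so the mapping can be released


# Sample values are hypothetical.
records = np.zeros(1, dtype=header_dtype)
records[0] = (500000, 4000000, 123.5, 1000, 1000)

tmp_dir = tempfile.mkdtemp()
path_a = os.path.join(tmp_dir, "header_tofile.bin")
path_b = os.path.join(tmp_dir, "header_memmap.bin")
write_with_tofile(path_a, records)
write_with_memmap(path_b, records)

# Read both files back to check the round trip.
read_a = np.fromfile(path_a, dtype=header_dtype)
read_b = np.fromfile(path_b, dtype=header_dtype)
```

For many records, the gain likely comes from building the whole array in memory and writing it in one call instead of packing values one at a time. memmap tends to be more useful when the file is updated in place or is too large to hold in memory.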