Search code examples
pythonexifjfif

How can I detect JPEG files by their EXIF / JFIF signature?


few days ago I asked a question in a different field and finally a friend(@emcconville) helped me with a script for "Recover every JPEG files in a single file" . Now I realized that this program only works on images with the standard "JFIF" and is not capable of retrieving images with "EXIF" standard (Images taken by digital cameras).

How to change the program so that it can also know the Exif standard in Images? I'm not familiar with Python, and I do not know the power of that.

Thanks

import struct

with open('src.bin', 'rb') as f:
    # Calculate file size.
    f.seek(0, 2)
    total_bytes = f.tell()
    # Rewind to beging.
    f.seek(0)
    file_cursor = f.tell()
    image_cursor = 0

    while file_cursor < total_bytes:
        # Can for start of JPEG.
        if f.read(1) == b"\xFF":
            if f.read(3) == b"\xD8\xFF\xE0":
                print("JPEG FOUND!")
                # Backup and find the size of the image
                f.seek(-8, 1)
                payload_size = struct.unpack('<I', f.read(4))[0]
                # Write image to disk
                d_filename = 'image{0}.jpeg'.format(image_cursor)
                with open(d_filename, 'wb') as d:
                    d.write(f.read(payload_size))
                image_cursor += 1
        file_cursor = f.tell()

Solution

  • EXIF files have a marker of 0xffe1, JFIF files have a marker of 0xffe0. So all code that relies on 0xffe0 to detect a JPEG file will miss all EXIF files. (from here)

    So just change

    if f.read(3) == b"\xD8\xFF\xE0":
    

    to

    if f.read(3) == b"\xD8\xFF\xE1":
    

    If you want to check for both cases, do not use .read() like that anymore. Instead something like

    x = f.read(3)
    if x in (b"\xD8\xFF\xE0", b"\xD8\xFF\xE1"):