
How to use magic bytes to identify files using Python


I was given a problem that stated:

We've extracted one of the alien zip files, it's a bunch of PNG files, but we think only one of them is valid. Use magic byte to determine which it is. Tip: Find and read the correct file to get the flag.

All the PNG files are stored in the /tmp directory. After a few attempts, this is as far as I have gotten. My code runs without errors, but it prints "no" for every file, so according to my code none of them is the valid one.

Here's my code so far:

import glob, os

magic_numbers = {'.png': bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])}
max_read_size = max(len(m) for m in magic_numbers.values())  # get max size of magic numbers of the dict
os.chdir("/tmp")
for x in glob.glob("*.png"):
    with open(x, 'rb') as fd:
        file_head = fd.read(max_read_size)

    if file_head.startswith(magic_numbers['.png']):
        print("It's a PNG File")
    else:
        print("no")

Clearly I am doing something wrong, but I cannot figure out what it is. Is it a problem with the loop? How am I supposed to use magic bytes to identify files?


Solution

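  • Your code only needed a bit of tweaking. The glob pattern "*.png" matches only names that end in ".png", so it assumes every file has that extension; "*png" also matches names without the dot. (The dictionary key is just a label, so renaming it from '.png' to 'png' is cosmetic.) Also, read the whole file and print file_head so you can see what each file contains. Run this: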

    import glob, os

    # The 8-byte signature that every valid PNG file starts with.
    magic_numbers = {'png': bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])}
    os.chdir("/tmp")

    for x in glob.glob("*png"):
        # Read the whole file, not just the header, so any data after the image is visible.
        with open(x, 'rb') as fd:
            file_head = fd.read()
            print(file_head)

        if file_head.startswith(magic_numbers['png']):
            print("It's a PNG File")
        else:
            print("no")

    It should print something like this:

    b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x00\x01\x00\x00\x00\x01\x08\x06\x00\x00\x00\x1f\x15\xc4\x89\x00\x00\x00\nIDATx\x9cc\x00\x01\x00\x00\x05\x00\x01\r\n-\xb4\x00\x00\x00\x00IEND\xaeB`\x82The flag is: 2NECGNQM4GD3QPD'
    It's a PNG File

    Your flag will probably be different from mine.

    Cheers!
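
If you would rather not dump every file's bytes to the terminal, here is a minimal sketch of the same idea split into small functions. The names is_png, trailing_data and find_valid_pngs are made up for illustration, not part of any library. It assumes the flag is simply appended after the PNG's final IEND chunk, as in the output above, and it locates that chunk with a plain byte search, which is a heuristic and not a real PNG parser.

```python
from pathlib import Path

# The 8-byte PNG signature, the same bytes used in the code above.
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# A PNG stream ends with an IEND chunk: the 4-byte type "IEND" plus a 4-byte CRC.
IEND_TYPE = b"IEND"
CRC_LENGTH = 4


def is_png(path):
    """Return True if the file at `path` starts with the PNG signature."""
    try:
        with open(path, "rb") as fd:
            # Only the first 8 bytes are needed for the check.
            return fd.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE
    except OSError:
        # Unreadable files are treated as "not a PNG" (an assumption of this sketch).
        return False


def trailing_data(data):
    """Return the bytes that follow the last IEND chunk, or b"" if none is found.

    Heuristic: searches for the literal bytes "IEND" instead of walking the chunks.
    """
    pos = data.rfind(IEND_TYPE)
    if pos == -1:
        return b""
    return data[pos + len(IEND_TYPE) + CRC_LENGTH:]


def find_valid_pngs(directory):
    """Yield (path, trailing bytes) for each regular file that has a PNG signature."""
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and is_png(path):
            yield path, trailing_data(path.read_bytes())
```

To use it, loop over find_valid_pngs("/tmp") and print each path together with its trailing bytes. Because it ignores file names entirely, it works whether or not the files have a ".png" extension.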