python, file, binary, file-manipulation

File manipulation in binary


Is it possible to convert a file (text, image, MP3) into just the binary data it is made up of, so that it can then be manipulated in, for example, Python or any other language? I poked around a bit online and binary files were mentioned a lot, but nothing was particularly useful or coherent. Thanks for any info. I've done a fair bit of high-level programming, so now I'm looking to branch out a bit.


Solution

  • If you want to manipulate binary files, use the rb (read binary) and wb (write binary) file modes:

    with open('binary_file.mp3', 'rb') as f:
        first_byte = f.read(1)
    

    To be clear, all files are binary. Some binary files can be interpreted as text files, but they're still stored in binary underneath. Think of it this way: a file is a series of numbers, and each number (a byte) can only be in the range 0 to 255. Then in the 1960s some Americans decided that if you see the number 65, it's actually the capital letter "A", 66 is "B", and so on; 97 is lower case "a", 98 is "b", and so on; and no number greater than 127 would be used. You could come up with your own mapping of numbers to letters (and people in other countries did), but you should probably use the mapping people have more or less all agreed on, which is called ASCII (and its backward-compatible extension, UTF-8).

    If you want to look at the actual numbers under the hood of a file, you need a hex editor. Hex editors show each byte in hexadecimal (base 16) rather than the decimal we are used to, so 65 appears as 41 and 255 as FF.

    If you want to see what the actual ones and zeros of a file are, just use this (the := operator requires Python 3.8+):

    with open('binary_file_name', 'rb') as f:
        while byte := f.read(1):
            print(f'{ord(byte):08b}')
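
Since the question was also about manipulating the bytes, here is a minimal sketch of a read, modify, write round trip. It is only an illustration, not the one right way to do it: the helper names read_bytes and write_bytes and the demo.bin file are made up for this example, and reading the whole file into memory is assumed to be acceptable (fine for small files, not for huge ones).

```python
import os
import tempfile

def read_bytes(path):
    """Return the whole file as a mutable bytearray (fine for small files)."""
    with open(path, 'rb') as f:
        return bytearray(f.read())

def write_bytes(path, data):
    """Write raw bytes back out; 'wb' truncates any existing file."""
    with open(path, 'wb') as f:
        f.write(data)

# Hypothetical demo file so the example is self-contained.
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.bin')
write_bytes(demo_path, b'Abc')

data = read_bytes(demo_path)
numbers = list(data)            # each byte is an int from 0 to 255
text = data.decode('ascii')     # the same bytes read through the ASCII mapping

data[0] = 66                    # 66 is "B" in ASCII, so this replaces the "A"
write_bytes(demo_path, data)
modified = bytes(read_bytes(demo_path))

print(numbers, text, modified)  # [65, 98, 99] Abc b'Bbc'
```

A bytearray is used because plain bytes objects are immutable. The same approach works on an image or MP3, but changing bytes at random will usually corrupt the file unless you know the format's layout.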