
C64 Recognize common patterns between disk images


I have two C64 disk images, for example:

  • disk1 is a game that starts with a splash screen and music.
  • disk2 is a demo that contains many songs, including the same music as the first disk.

Is there any chance that, by examining the two images with a hex editor, I would see the same binary sequence at some point in both files? Will the music be stored in the same way? And if so, what would be the right approach to match the pattern?

(The goal is to search for that song in 150k+ program files and see which disks it is used on.)


Solution

  • Unfortunately, the answer is that it depends. I will outline the main factors here.

    1. Demos often contained music ripped from games, and newer games might contain well-known music from a demo. In both cases the music is the same, and the player routine would be the same too, because the player was usually an integral part of the music itself. So in this basic case the same sequence of binary data is stored on the disc in one form or another.
    2. Demos often compressed their contents in order to load faster or to produce smaller programs. In that case the sequence of binary data on disc is definitely different. You can spot this if there is some "noise" from decompression at the beginning of the demo, often coloured lines in the border or characters changing on the screen.
    3. Some larger games also compressed the content they loaded, e.g. GI Joe. The sequences on disc are then different. Some memory dump tools used for cracking, e.g. ISEPIC, also compressed the memory image.
    4. Some games even encrypted their contents, e.g. Bards Tale 2. Again, the sequences on disc are different. You cannot know until you disassemble the game's loading routine.

    In cases 2 to 4 there is no hope of matching the raw bytes on disc. (I do not know exactly what kind of games or demos you are looking at.)
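
    As a rough triage for cases 2 to 4, one heuristic that goes beyond what is described above is to measure byte entropy, since packed or encrypted data tends to look close to random. This is only a sketch: the function names and the 7.0 bits-per-byte threshold are assumptions made up for this example and would need tuning against real C64 files.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Return the entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Guess whether `data` is compressed or encrypted.

    The threshold is an assumed value, not a measured one.
    """
    return shannon_entropy(data) >= threshold
```

    A file flagged this way would have to be unpacked (or run and dumped from memory) before any byte comparison makes sense.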

    5. Furthermore, disc images have a fixed layout (see http://unusedino.de/ec64/technical/formats/d64.html). Given this layout, the same sequence of bytes will be distributed across different sectors of the disc. These sectors are not consecutive either; their order is scattered, though not truly random. Unless the data you are looking for is smaller than a disc sector (256 bytes, of which the first 2 are a link to the next sector), the sequence is very unlikely to end up in the same order on the same tracks and sectors. So unless the files were identical and copied onto the disc in the same order, you get different sequences.
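
    To make the layout concrete, here is a minimal sketch of how a track and sector pair maps to a byte offset in a standard 35-track, 174,848-byte .d64 image without error bytes. The zone sizes follow the format description linked above; the function names are my own.

```python
SECTOR_SIZE = 256

def sectors_in_track(track: int) -> int:
    """Number of sectors on a track of a standard 35-track 1541 disc."""
    if 1 <= track <= 17:
        return 21
    if 18 <= track <= 24:
        return 19
    if 25 <= track <= 30:
        return 18
    if 31 <= track <= 35:
        return 17
    raise ValueError(f"track {track} is outside 1..35")

def sector_offset(track: int, sector: int) -> int:
    """Byte offset of (track, sector) inside the image file."""
    if not 0 <= sector < sectors_in_track(track):
        raise ValueError(f"sector {sector} does not exist on track {track}")
    # Tracks are stored one after another, starting with track 1.
    preceding = sum(sectors_in_track(t) for t in range(1, track))
    return (preceding + sector) * SECTOR_SIZE

print(hex(sector_offset(18, 0)))  # 0x16500, the first sector of track 18
```

    Since each sector carries only 254 bytes of file data after its two link bytes, a 1 KB song is split over about five sectors, and those sectors are usually not adjacent in the image.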

    Given point 5, you will very likely not find the sequence in the same order in the raw image, even without compression or encryption.

    You could look for sector-sized chunks of 256 bytes. But the music may begin at a different position in each file, so even matching sectors can hold different data: for example, in one file a sector contains the music starting at offset 0, and in the other starting at offset 15.
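
    One way around the alignment problem is to index every overlapping window of one file instead of sector-aligned chunks, then slide over the other file and extend each hit. The following is a sketch of that idea, not a tuned implementation; `shared_runs` and the default window of 32 bytes are names and values chosen for this example.

```python
def shared_runs(known: bytes, candidate: bytes, window: int = 32):
    """Find byte runs of at least `window` bytes present in both inputs.

    Returns a list of (offset_in_known, offset_in_candidate, length).
    """
    # Map every overlapping window of `known` to the offsets where it occurs.
    index = {}
    for i in range(len(known) - window + 1):
        index.setdefault(known[i:i + window], []).append(i)

    runs = []
    j = 0
    while j + window <= len(candidate):
        starts = index.get(candidate[j:j + window])
        if not starts:
            j += 1
            continue
        # Extend each hit byte by byte and keep the longest one.
        best_start, best_len = starts[0], 0
        for i in starts:
            n = window
            while (i + n < len(known) and j + n < len(candidate)
                   and known[i + n] == candidate[j + n]):
                n += 1
            if n > best_len:
                best_start, best_len = i, n
        runs.append((best_start, j, best_len))
        j += best_len  # continue after the matched run
    return runs
```

    Because the windows overlap, a song that starts at offset 0 in one file and offset 15 in the other is still reported as one run. For 150k+ files, the per-file dictionary would probably have to be replaced by a shared index of window hashes.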

    At the very least, you need to look at the files stored on the image. You can parse the directory of the disc image (the rough equivalent of a FAT) quite easily and find the files. A file is a chain of sectors, each identified by a track and sector number. You can load the files into memory and then compare them. Here you would need algorithms that find parts of byte arrays inside other byte arrays, because any part of the demo could be the music and it could sit anywhere in the game's code. Because the data is very small by modern standards, brute force might even work.
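
    The directory walk and file extraction can be sketched as follows, assuming a standard 35-track .d64 image with the directory starting at track 18, sector 1. The helper names are invented for this example, and there is no handling of truncated or non-standard images.

```python
SECTORS_PER_TRACK = [21] * 17 + [19] * 7 + [18] * 6 + [17] * 5  # tracks 1..35
SECTOR_SIZE = 256

def read_sector(image: bytes, track: int, sector: int) -> bytes:
    if not 1 <= track <= 35 or not 0 <= sector < SECTORS_PER_TRACK[track - 1]:
        raise ValueError(f"bad track/sector {track}/{sector}")
    start = (sum(SECTORS_PER_TRACK[:track - 1]) + sector) * SECTOR_SIZE
    return image[start:start + SECTOR_SIZE]

def walk_chain(image: bytes, track: int, sector: int):
    """Yield each sector of a chain; bytes 0 and 1 link to the next sector."""
    seen = set()  # guards against circular links in damaged images
    while track != 0 and (track, sector) not in seen:
        seen.add((track, sector))
        block = read_sector(image, track, sector)
        yield block
        track, sector = block[0], block[1]

def read_file(image: bytes, track: int, sector: int) -> bytes:
    """Concatenate the data bytes of a file starting at (track, sector)."""
    data = bytearray()
    for block in walk_chain(image, track, sector):
        if block[0] == 0:
            # Last sector: byte 1 is the index of the last used byte.
            data += block[2:block[1] + 1]
        else:
            data += block[2:]
    return bytes(data)

def list_files(image: bytes):
    """Return (name, first_track, first_sector) for each directory entry."""
    files = []
    for block in walk_chain(image, 18, 1):
        for i in range(0, SECTOR_SIZE, 32):  # eight 32-byte entries per sector
            entry = block[i:i + 32]
            if entry[2] == 0 or entry[3] == 0:
                continue  # scratched or empty entry
            name = entry[5:21].rstrip(b"\xa0")  # names are padded with $A0
            files.append((name, entry[3], entry[4]))
    return files
```

    For PRG files the first two bytes of the extracted data are the load address. If you already have the exact bytes of the song, Python's `needle in data` on each extracted file is a perfectly usable brute-force search.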

    6. A few games used their own disc layout and loaded raw tracks and sectors directly, presumably for faster loading or copy protection. In these cases you are out of luck.