Search code examples
reverse-engineeringcrccrc32

CRC32 Parameters Reverse Engineering having access to multiple examples


I have to find out how to reproduce the CRC32 algorithm used on a proprietary database file, the file consists of many "chunks" of 128 bytes, each being a record. I know that for each record, bytes 1-4 are the CRC32 Checksum, and the next 35 bytes don't seem to matter, as I can change them easily without the application telling me the CRC Check has failed. Therefore, I am looking to find out what polynomial and other parameters are being used to calculate the latter. Below is an example.

Real Example of a record Text version:

00 27 AE 3B 9F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 41 08 41 41 41 41 41 41 41 41 
19 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 
42 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00

If we take just the bytes we can't change, breaking the record, we get this:

41 08 41 41 41 41 41 41 41 41 19 42 42 42 42 42 42 42 42 42 42 42 42 42 42 
42 42 42 42 42 42 42 42 42 42 42 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00

the CRC32 for the above is 27 AE 3B 9F

Real Record Example 1.1, differing only by one byte from the above (CRC is BC D4 84 FB):

41 08 41 41 41 41 41 41 41 41 19 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 
42 42 42 42 42 42 42 42 42 43 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00

Real Record Example 2 (Output CRC is 3B 6A D1 AF):

41 07 41 41 41 41 41 41 41 00 19 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 
42 42 42 42 42 42 42 42 42 42 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00

Real Record Example 3 (Output CRC is 0B 54 CC 09):

41 01 31 00 00 00 00 00 00 00 03 41 73 61 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00

Real Record Example 4 (Output CRC is 12 91 EA 8E):

41 B4 A8 D0 02 46 00 B4 A8 00 03 52 4D 31 03 53 54 50 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 25 00 00 00 00 00 00 00 
00 00 A3 05 00 00 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 
00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

Real Record Example 5 (Output CRC is 8A 68 00 3B):

41 B4 A8 D0 02 46 00 B4 A8 01 03 52 4D 31 03 53 54 50 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 FF 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 25 00 00 00 00 00 00 00 
00 00 A3 05 00 00 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 
00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 64 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The two last records differ on only one byte. By using the approach @rcgldr specified, I was able to get a final Xor value of 0x9902539d and I could successfully change data without the application complaining. I ran some code to find these final xor values for every entitity/file on the application and was successful on all of them, but being able to find a single crc parameter set would be a great addition.

EDIT: Added two more example records

EDIT 2: Added one more example that only differs from one byte comparing to the first one

EDIT 3: Added two more examples, with a different size, as their from another type of record within the application. Also deleted part of the question as it became irrelevant


Solution

  • xor'ing 1.0 and 1.1 results in:

    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
    00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
    00 00 00 00 00 00 00 00 00 00
    

    xor'ing the two crcs results in

    9b 7a bf 64
    

    Assuming "little endian" for the stored crc, the calculated crc is

    0x64bf7a9b
    

    By xor'ing two records, the initial value and final xor value are canceled out due to the xor, which allows the crc polynomial to be determined based on the data alone, assuming that initial value = 0 and that final xor value = 0. Taking advantage of this, I tried some common crc polynomials and determined that the crc polynomial is

    0x104C11DB7 or ignoring the msb: 0x04C11DB7
    

    Using this web site that you linked to in your comment:

    http://www.sunshine2k.de/coding/javascript/crc/crc_js.html

    The parameters are:

    crc32
    custom
    input:  not reflected
    result: not reflected
    polynomial: 0x04C11DB7
    initial value: 0x0
    final xor value: 0x0
    

    If the data is always the same size, then either the initial value or the final xor value or a combination of both can be used to adjust the crc so that it matches the actual crc's shown in the examples, but it is simplest to use the final xor to match the examples, since it just requires calculating a crc with one of the examples, assuming initial value = 0 and final xor value = 0, then xor'ing the calculated crc with the actual crc from the example crc to calculate a final xor value for a specific length of data.

    So for the data size in the first examples, a final xor value of 0x189B52BC will produce crc's that match the examples. These are the parameters for the crc calculator.

    crc32
    custom
    input:  not reflected
    result: not reflected
    polynomial: 0x04C11DB7
    initial value: 0x0
    final xor value: 0x189B52BC
    

    These parameters match all of the first examples you posted. Again, note that the crc's are stored "little endian", most significant byte first.

    If the data size is variable, then an initial value is needed (and it's possible both an initial value and final xor value are used). Once the polynomial is known, it's possible to do a "reverse" CRC to find the initial value, or a brute force search can be used. I did a brute force search for an initial value using a fast crc calculator (since I don't have a "reverse" CRC program yet), and it appears that it will work for any data size, at least based on the new examples you added. These parameters work with all of the examples above, include the new ones you added:

    crc32
    custom
    input:  not reflected
    result: not reflected
    polynomial: 0x04C11DB7
    initial value: 0xc704dd7b
    final xor value: 0x0
    

    The initial value of 0xc704dd7b is the crc that is generated with a data pattern of {ff ff ff ff}, with initial value = 0 and final xor value = 0. It's the same as prefixing the data with {ff ff ff ff}.