
C - reading binary file, validating checksum


In C, what would be a proper way to read the contents of a binary file and validate the checksum?

Here is a sample of the data I'm working with:

0A 01 17 D8 04 00 07 9A 1F 10 FF CF 7F FF FF FF
FF 7F 7F 7F FF 7F FF FF FF FF 7F 81 01 01 03 01
01 01 01 81 00 73 67 68 66 97 6C 76 64 64 6A 6B
6E 64 66 67 44 41 [17 7A]

and

00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 44 41 [00 85]

The checksum (between square brackets) stores the sum of the previous 54 bytes as a 2-byte (big-endian) number.
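
To make the layout concrete, here is a minimal sketch that splits one 56-byte record into the two values being compared. The names `PAYLOAD_LEN`, `RECORD_LEN`, `payload_sum`, and `stored_checksum` are made up for illustration; only the 54-byte payload and the 2-byte big-endian trailer come from the description above.

```c
#include <stddef.h>

#define PAYLOAD_LEN 54                 /* bytes covered by the checksum */
#define RECORD_LEN  (PAYLOAD_LEN + 2)  /* payload plus 2-byte checksum  */

/* Add up the payload bytes. The largest possible result is
   54 * 255 = 13770, which fits comfortably in an unsigned int. */
unsigned payload_sum(const unsigned char *rec)
{
    unsigned sum = 0;
    for (size_t i = 0; i < PAYLOAD_LEN; i++)
        sum += rec[i];
    return sum;
}

/* Decode the big-endian trailer: high byte first, then low byte. */
unsigned stored_checksum(const unsigned char *rec)
{
    return ((unsigned)rec[PAYLOAD_LEN] << 8) | rec[PAYLOAD_LEN + 1];
}
```

For the second sample, the 52 zero bytes plus 0x44 and 0x41 add up to 0x85, and the trailer 00 85 decodes to 0x0085, so both functions return the same value.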

This is what I've been using:

#include <stdio.h>
#include <stdlib.h>  /* for system() */
int main(int argc, char **argv)
{
    FILE *f = fopen(argv[1], "rb");

    unsigned char data[512];
    fread(data, 1, 512, f);

    int i;
    int sum = 0;
    for (i = 0; i < 54; i++)
    {
        sum += data[i];
    }

    if ((sum >> 8)   == data[54] &&
        (sum & 0xFF) == data[55])
    {
        printf("Checksum is valid.\n");
    }
    else
    {
        printf("Checksum is invalid.\n");
    }
    system("pause");
}

I have used a char array to store the bytes, using the indexes to recalculate the checksum in a loop. To validate I used some bitwise shifting and masking. Is there a better solution?

Thanks!


Solution

  • What you've written is reasonable, though if the file should only contain 56 bytes, there's no real point in trying to read 512. Also, rather than repeating 512 in the fread call, use sizeof(data). You might package the code as a function, and you might print both the computed and the expected checksum when there's a mismatch. A simple additive checksum is not very sensitive: because addition ignores byte order, it cannot detect transposed bytes, and errors that cancel each other out slip through as well, though it will still catch a fair number of mistakes.


    Hmmm...looking again...You've written:

    if ((sum >> 8)   == data[54] &&
        (sum & 0xFF) == data[55])
    

    Since sum is a (signed) int, you should probably be writing:

    if (((sum >> 8) & 0xFF) == data[54] &&
        (sum & 0xFF)        == data[55])
    

    Otherwise, the comparison can fail once the sum no longer fits in 16 bits. That can't happen with just 54 bytes of data (the maximum is 54 × 255 = 13770), but if the data to be checksummed were long enough (at least 258 bytes, since 257 × 255 is exactly 65535), the sum could exceed 65535. Then sum >> 8 would be larger than 255 and could never equal a single byte, so the comparison would fail when it shouldn't. I'm assuming sizeof(int) == 4, and not sizeof(int) == 2.
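
Putting those suggestions together, a rough sketch might look like the following. The names `checksum16` and `verify_record` are hypothetical, and the code assumes that the stored checksum is meant to hold the low 16 bits of the byte sum, which is how the answer above reads the format.

```c
#include <stdio.h>
#include <stddef.h>

#define PAYLOAD_LEN 54                 /* bytes covered by the checksum */
#define RECORD_LEN  (PAYLOAD_LEN + 2)  /* payload plus 2-byte checksum  */

/* Sum n bytes and keep only the low 16 bits, so the result can always be
   compared with a 2-byte checksum, however long the data is. */
unsigned checksum16(const unsigned char *p, size_t n)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += p[i];
    return (unsigned)(sum & 0xFFFFUL);
}

/* Read one record from f and verify it.
   Returns 1 if valid, 0 if the checksum differs, -1 on a short read. */
int verify_record(FILE *f)
{
    unsigned char data[RECORD_LEN];

    /* Read exactly one record; sizeof(data) avoids repeating the length. */
    if (fread(data, 1, sizeof(data), f) != sizeof(data))
        return -1;

    unsigned actual   = checksum16(data, PAYLOAD_LEN);
    unsigned expected = ((unsigned)data[PAYLOAD_LEN] << 8) | data[PAYLOAD_LEN + 1];

    if (actual != expected) {
        fprintf(stderr, "checksum mismatch: computed 0x%04X, stored 0x%04X\n",
                actual, expected);
        return 0;
    }
    return 1;
}
```

A caller would open the file with fopen(path, "rb"), check the result against NULL, and pass the stream to verify_record. With 300 bytes of 0xFF the raw sum is 76500 (0x12AD4), and checksum16 returns 0x2AD4, which is the case where the unmasked sum >> 8 comparison would go wrong.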