
converting little endian hex to big endian decimal in C


I am trying to understand and implement a simple file system based on FAT12. I am currently looking at the following snippet of code and it's driving me crazy:

int getTotalSize(char * mmap) {
    int *tmp1 = malloc(sizeof(int));
    int *tmp2 = malloc(sizeof(int));
    int retVal;

    *tmp1 = mmap[19];
    *tmp2 = mmap[20];
    printf("%d and %d read\n",*tmp1,*tmp2);
    retVal = *tmp1+((*tmp2)<<8);
    free(tmp1);
    free(tmp2);
    return retVal;
}

From what I've read so far, the FAT12 format stores integers in little endian format, and the code above is getting the size of the file system, which is stored in the bytes at offsets 19 and 20 of the boot sector.

However, I don't understand why

  retVal = *tmp1+((*tmp2)<<8);

works. Is the bitwise <<8 converting the second byte to decimal? Or to big endian format? Why is it only applied to the second byte and not the first one?

The bytes in question are (in little endian format):

40 0B

I tried converting them manually by first switching the order to

0B 40

and then converting from hex to decimal, and I get the right output. I just don't understand how adding the first byte to the bitwise shift of the second byte does the same thing. Thanks


Solution

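  • The use of malloc() here is seriously facepalm-inducing. Utterly unnecessary, and a serious "code smell" (makes me doubt the overall quality of the code). Also, mmap clearly should be a pointer to unsigned char (or, even better, uint8_t): where plain char is signed, any byte of 0x80 or above is read as a negative number and corrupts the sum.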
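
As a rough sketch of what the function could look like without malloc() and with unsigned bytes (the name get_total_size and the parameter name boot_sector are made up for this example, and it assumes the buffer holds at least 21 bytes):

```c
#include <stdint.h>

/* Hypothetical rewrite of getTotalSize(); names are illustrative only. */
uint16_t get_total_size(const uint8_t *boot_sector)
{
    /* Offsets 19 and 20 hold the low and high byte of the 16-bit field. */
    uint8_t lsb = boot_sector[19];
    uint8_t msb = boot_sector[20];

    /* Move the high byte into bits 8..15, then merge in the low byte. */
    return (uint16_t)((msb << 8) | lsb);
}
```

With the bytes from the question (0x40 at offset 19, 0x0B at offset 20), this should return 0x0B40, i.e. 2880.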

    That said, the code you're asking about is pretty straightforward.

    Given two byte-sized values a and b, there are two ways of combining them into a 16-bit value (which is what the code is doing): you can either consider a to be the least-significant byte, or b.

    Using boxes with the most significant byte on the left, the 16-bit value can look either like this, with a as the most significant byte:

    +---+---+
    | a | b |
    +---+---+
    

    or like this, if you instead consider b to be the most significant byte:

    +---+---+
    | b | a |
    +---+---+
    

    The way to combine the lsb and the msb into a 16-bit value is simply:

    result = (msb * 256) + lsb;
    

    UPDATE: The 256 comes from the fact that that's the "worth" of each successively more significant byte in a multibyte number. Compare it to the role of 10 in a decimal number (to combine two single-digit decimal numbers c and d you would use result = 10 * c + d).

    Consider msb = 0x01 and lsb = 0x00, then the above would be:

    result = 0x1 * 256 + 0 = 256 = 0x0100
    

    You can see that the msb byte ended up in the upper part of the 16-bit value, just as expected.

    Your code is using << 8 to do bitwise shifting to the left, which is the same as multiplying by 2^8, i.e. 256.
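
To make the equivalence concrete, here is a small sketch with both forms side by side (the helper names combine_mul and combine_shift are invented for illustration):

```c
#include <stdint.h>

/* Combine two bytes by multiplying the high byte by 256. */
uint16_t combine_mul(uint8_t msb, uint8_t lsb)
{
    return (uint16_t)(msb * 256 + lsb);
}

/* Combine two bytes by shifting the high byte left by 8 bits. */
uint16_t combine_shift(uint8_t msb, uint8_t lsb)
{
    return (uint16_t)((msb << 8) + lsb);
}
```

For the bytes in the question, lsb = 0x40 and msb = 0x0B, both give 11 * 256 + 64 = 2880 = 0x0B40, which matches the result of swapping the bytes by hand and converting from hex.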

    Note that result above is a value, i.e. not a byte buffer in memory, so its endianness doesn't matter.
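
One way to see this is to compare the arithmetic approach with copying the raw bytes into an integer. The following is only a sketch, and the names read_le16 and read_raw16 are hypothetical:

```c
#include <stdint.h>
#include <string.h>

/* Builds the value arithmetically from little-endian bytes.
   The result is the same on any host, whatever its byte order. */
uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Copies the two bytes as they sit in memory.
   The result depends on the byte order of the machine running the code. */
uint16_t read_raw16(const uint8_t *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Given the bytes 40 0B, read_le16 yields 0x0B40 everywhere, while read_raw16 yields 0x0B40 on a little-endian host and 0x400B on a big-endian one. Endianness only becomes a concern when the bytes are reinterpreted in memory, not when the value is computed from them.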