
Getting next cluster number in FAT12


I am following BrokenThorn's tutorial for OS development. I am confused by this piece of code, which is responsible for reading the next cluster number of the file:

      mov     ax, WORD [cluster]  ; identify current cluster from FAT

 ; is the cluster odd or even? Just divide it by 2 and test!

      mov     cx, ax              ; copy current cluster
      mov     dx, ax              ; copy current cluster
      shr     dx, 0x0001          ; divide by two
      add     cx, dx              ; sum for (3/2)

      mov     bx, 0x0200          ; location of FAT in memory
      add     bx, cx              ; index into FAT
      mov     dx, WORD [bx]       ; read two bytes from FAT

      test    ax, 0x0001
      jnz     .ODD_CLUSTER

From my reading of online sources and threads, this is what I have found:

  1. The first cluster number in the file's root directory entry is 2 bytes long. For FAT12, only the lower 12 bits of these 2 bytes are used.
  2. The FAT for FAT12 stores entries in the following format: vwX uYZ, where XYZ is one cluster number and uvw is another. I have a question regarding this: which one represents the lower-numbered FAT entry, and which the higher?

However, looking at the code, I cannot understand how the above 2 facts (if assumed to be correct) are being used. Initially, ax has the 2 bytes from the root directory and its lower 12 bits can be used directly. But that is not being done. Also, how is the vwX uYZ format being parsed here?

If someone could explain this in some detail and point out any mistakes I have made, it would be very helpful.


Solution

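  • The starting cluster number is used as an index into the FAT. Since it is FAT12, each entry is 12 bits wide, so every 2 clusters correspond to 3 bytes and entry N starts at byte offset N * 3 / 2. The code computes this as cluster + cluster / 2 (the shr and add), then adds 0x0200, the address where the FAT was loaded.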

    ax has the 2 bytes from the root directory and its lower 12 bits can be used directly. But that is not being done.

    All 16 bits of ax are used. Since the upper 4 bits of a valid FAT12 starting cluster number are 0, that is equivalent to using only the lower 12 bits (unless the directory entry is corrupted, in which case the index could point past the end of the FAT).
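
As a rough illustration of that index calculation, here is a minimal C sketch. The function names `fat12_entry_offset` and `fat12_entry_in_bounds` are hypothetical and not part of the tutorial. The bounds check is an assumption about how one might guard against a corrupted directory entry; the tutorial code performs no such check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of a FAT12 entry inside the FAT: cluster * 3 / 2,
   computed as cluster + cluster / 2, like the mov/shr/add sequence. */
static size_t fat12_entry_offset(uint16_t cluster)
{
    return (size_t)cluster + (size_t)(cluster >> 1);
}

/* Hypothetical guard: returns 1 if the two bytes read at the entry's
   offset lie inside a FAT of fat_size bytes, 0 otherwise. */
static int fat12_entry_in_bounds(uint16_t cluster, size_t fat_size)
{
    assert(fat_size >= 2);  /* a FAT must hold at least one 2-byte read */
    return fat12_entry_offset(cluster) <= fat_size - 2;
}
```

With these definitions, clusters 2, 3, 4 and 5 map to offsets 3, 4, 6 and 7. Assuming a 4608-byte FAT (9 sectors of 512 bytes, as on a typical 1.44 MB floppy), a bogus cluster value such as 0xF003 would be rejected, whereas the assembly would read from whatever happens to be at that address.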

    Also, how is the vwX uYZ format being parsed here?

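    That is better written as three bytes: vw Xu YZ. Recall that x86 is little-endian. When you read the 2 bytes vw Xu as a word, the number you get is Xuvw; mask it to keep only the lower 12 bits and you get uvw. Similarly, when you read Xu YZ, the number you get is YZXu; shift it right by 4 bits and you get YZX. So the even-numbered (lower) entry is uvw and the odd-numbered (higher) entry is YZX, which is why the code tests the low bit of ax to choose between masking and shifting. Incidentally, this means that if the two cluster numbers are uvw and XYZ, the actual byte layout is likely to be vw Zu XY.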
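
The mask-or-shift step can be sketched in C as follows. This is only an illustration of the decoding described above, not the tutorial's code: `fat12_next_cluster` is a hypothetical name, and the function assumes the caller has already loaded the FAT into a byte buffer that is large enough for the requested entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Next cluster in the chain, read from a FAT12 table held in memory.
   Mirrors the assembly: compute the offset, load a little-endian word,
   then mask (even cluster) or shift (odd cluster). */
static uint16_t fat12_next_cluster(const uint8_t *fat, uint16_t cluster)
{
    assert(fat != NULL);

    /* cluster * 3 / 2, computed as cluster + cluster / 2 */
    size_t off = (size_t)cluster + (size_t)(cluster >> 1);

    /* Little-endian 16-bit load, like `mov dx, WORD [bx]` */
    uint16_t word = (uint16_t)(fat[off] | (fat[off + 1] << 8));

    if (cluster & 1)
        return (uint16_t)(word >> 4);    /* odd entry: upper 12 bits */
    return (uint16_t)(word & 0x0FFF);    /* even entry: lower 12 bits */
}
```

As a worked example, suppose entries 2 and 3 hold 0x123 and 0x456. In the vw Zu XY layout they occupy the three bytes 0x23 0x61 0x45 at offsets 3 to 5. For cluster 2 the word read is 0x6123, and masking gives 0x123. For cluster 3 the word read is 0x4561, and shifting right by 4 gives 0x456.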