Search code examples
unixfilesystemsinode

Understanding the concept of Inodes


I am referring to the link about concepts of Inodes

I am confused on parts:

  1. 12 direct block pointers
  2. 1 single indirect block pointer
  3. 1 double indirect block pointer
  4. 1 triple indirect block pointer

Now the diagram says that each pointer is 32/64 bits.

  • [Query]: Why and how are these values inferred? I mean why specifically have only 32 or 64 bit pointers?

The diagram says, One data block {8 KB} for each pointer {4 bytes/8 bytes}

  • [Query]: How does this actually work out? i.e. 8*1024 bytes / 8 bytes = 1024 bytes? What is the logic behind having a 8 bytes pointer for 8KB block?

Solution

  • The pointers referred to are disk block addresses - each pointer contains the information necessary to identify a block on disk. Since each disk block is at least 512 bytes (sometimes 4096 or 8192 bytes), using 32-bit addresses the disk can address up to 512 * 4 * 10243 = 2 TiB (Tebibytes - more commonly called Terabytes) assuming 1/2 KiB blocks; correspondingly larger sizes as the block size grows (so 32 TiB at 8 KiB block size). For an addressing scheme for larger disks, you would have to move to larger block sizes or larger disk addresses - hence 48-bit or 64-bit addresses might be plausible.

    So, to answer Q1, 32-bits is a common size for lots of things. Very often, when 32 bits are no longer big enough, the next sensible size is 64 bits.

    Answering Q2:

    • With 8 KiB data blocks, if the file is 96 KiB or smaller, then it uses 12 blocks or less on disk, and all those block addresses are stored directly in the inode itself.

    • When the file grows bigger, the disk driver allocates a single indirect block, and records that in the inode. When the driver needs to get a block, it reads the indirect block into memory, and then finds the address for the block it needs from the indirect block. Thus, it requires (nominally) two reads to get to the data, though of course the indirect tends to be cached in memory.

    • With an 8 KiB block size and 4-byte disk addresses, you can fit 2048 disk addresses in the single indirect block. So, for files from 96 KiB + 1 byte to 16 MiB or so, there is only a single indirect block.

    • If a file grows still bigger, then the driver allocates a double indirect block. Each pointer in the double indirect block points to a single indirect block. So, you can have 2048 more indirect blocks, each of which can effectively point at 16 MiB, leading to files of up to 32 GiB (approx) being storable.

    • If a file grows still larger, then the driver allocates a triple indirect block. Each of the 2048 pointers in a triple indirect block points to a double block. So, under the 32-bit addressing scheme with 32-bit addresses, files up to about 64 TiB could be addressed. Except that you've run out of disk addresses before that (32 TiB maximum because of the 32-bit addresses to 8 KiB blocks).

    So, the inode structure can handle files bigger than 32-bit disk addresses can handle.

    I'll leave it as an exercise for the reader to see how things change with 64-bit disk addresses.