hadoop block hdfs

File blocks on HDFS


Does Hadoop guarantee that different blocks of the same file will be stored on different machines in the cluster? Obviously, the replicas of any one block will be on different machines.


Solution

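  • No. HDFS only keeps the replicas of a single block on different DataNodes; it makes no such promise for different blocks of the same file. If you look at the diagram in the HDFS Architecture Guide, you'll see that file part-1 has a replication factor of 3 and is made up of three blocks labelled 2, 4, and 5. Note that blocks 2 and 5 sit on the same DataNode in one case.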
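
If you want to check this for one of your own files, one rough approach is to take a block-to-host mapping (for example, read off the output of `hdfs fsck <path> -files -blocks -locations`) and look for hosts that hold more than one block of the file. The sketch below is only an illustration: the helper name `colocated_blocks`, the host names `dn1` to `dn8`, and the placement itself are made up for the example and are not copied from the guide's diagram.

```python
from collections import defaultdict


def colocated_blocks(block_hosts):
    """Return {host: sorted block ids} for hosts holding 2+ blocks of one file.

    block_hosts maps a block id to the list of hosts storing a replica of it.
    """
    blocks_on_host = defaultdict(set)
    for block_id, hosts in block_hosts.items():
        for host in hosts:
            blocks_on_host[host].add(block_id)
    return {
        host: sorted(blocks)
        for host, blocks in blocks_on_host.items()
        if len(blocks) > 1
    }


# Hypothetical placement for a file with blocks 2, 4, 5 and replication 3.
# Each block's replicas are on distinct hosts, yet blocks 2 and 5 share dn1.
placement = {
    2: ["dn1", "dn3", "dn4"],
    4: ["dn2", "dn5", "dn6"],
    5: ["dn1", "dn7", "dn8"],
}

print(colocated_blocks(placement))  # {'dn1': [2, 5]}
```

An empty result would mean that, at the moment you checked, no host held two blocks of the file. That is a property of that particular placement, not something HDFS promises.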