hadoop hdfs orc

Can an HDFS block of 128 MB store two different ORC files of 1 MB each?


I'm working on the storage side of Hadoop and trying to understand how ORC files are stored in HDFS blocks.


Solution

  • In HDFS, a file is composed of blocks. One block cannot hold multiple files.

    Two ORC files of 1 MB each will therefore need two blocks, one per file.

    If you are concerned about the actual disk space consumed, it will be only about 2 MB per replica. Although the block size is 128 MB, that is an upper limit: the disk space used is determined by the actual size of the data in each block, not by the configured block size.
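
To make the arithmetic concrete, here is a minimal sketch in plain Python. It does not call any Hadoop API; the helper names `blocks_needed` and `disk_usage` are made up for illustration, and the 128 MB block size and replication factor are passed in as assumptions rather than read from a cluster.

```python
BLOCK_SIZE = 128 * 1024 * 1024  # assumed block size (128 MB), in bytes
MB = 1024 * 1024


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of blocks one file occupies: blocks are never shared between files."""
    if file_size <= 0:
        return 0  # an empty file has no blocks
    return -(-file_size // block_size)  # integer ceiling division


def disk_usage(file_sizes, replication=1):
    """Bytes on disk: only the real data is stored, once per replica."""
    return sum(file_sizes) * replication


orc_files = [1 * MB, 1 * MB]  # two hypothetical 1 MB ORC files

total_blocks = sum(blocks_needed(size) for size in orc_files)
used_bytes = disk_usage(orc_files)

print(total_blocks)      # 2 blocks, one per file
print(used_bytes // MB)  # 2 MB on disk per replica, not 256 MB
```

With a replication factor of 3, `disk_usage(orc_files, replication=3)` gives 6 MB across the cluster, still far less than two full 128 MB blocks.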