hbase, data-recovery, wal

HBase BulkLoad data recovery


Because the bulk load method bypasses the write path entirely, the WAL is not written to as part of the process. How, then, is bulk-loaded data recovered if a region server fails or crashes?


Solution

  • HBase stores data in HFiles, which are immutable and live in HDFS, which is already reliable storage. On the normal write path, HBase accumulates edits in memory (the MemStore) before it creates an HFile; by default a MemStore is flushed once it reaches about 128 MB. HBase uses the WAL to keep those in-memory edits durable until they are flushed. A bulk load doesn't need the WAL, because the operation creates the HFiles directly and then tells HBase to use them as part of its data storage. There is no in-memory state to lose, so nothing has to be replayed after a region server crash.
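To make the difference between the two paths concrete, here is a minimal toy model in plain Java. It is not HBase code and uses no HBase API: the class `BulkLoadRecoveryDemo` and its methods `put`, `flush`, `bulkLoad`, `crashAndRecover` and `get` are hypothetical names invented for this sketch. The `wal` and `hfiles` fields stand in for durable data on HDFS, and `memstore` stands in for volatile region server memory.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of one region server. Illustrative only, not the HBase API. */
final class BulkLoadRecoveryDemo {
    // Durable state: stands in for files on HDFS and survives a crash.
    private final List<String[]> wal = new ArrayList<>();
    private final List<Map<String, String>> hfiles = new ArrayList<>();

    // Volatile state: stands in for the MemStore and is lost on a crash.
    private Map<String, String> memstore = new TreeMap<>();

    /** Normal write path: append to the WAL first, then buffer in memory. */
    void put(String row, String value) {
        wal.add(new String[] {row, value});
        memstore.put(row, value);
    }

    /** Flush: persist the buffered edits as an immutable file, then drop the WAL entries. */
    void flush() {
        hfiles.add(Collections.unmodifiableMap(new TreeMap<>(memstore)));
        memstore = new TreeMap<>();
        wal.clear();
    }

    /** Bulk load: adopt a pre-built file directly. No WAL entry, no in-memory copy. */
    void bulkLoad(Map<String, String> prebuiltFile) {
        hfiles.add(Collections.unmodifiableMap(new TreeMap<>(prebuiltFile)));
    }

    /** Crash: memory is wiped. Recovery: replay the WAL into a fresh MemStore. */
    void crashAndRecover() {
        memstore = new TreeMap<>();
        for (String[] edit : wal) {
            memstore.put(edit[0], edit[1]);
        }
    }

    /** Read: check memory first, then the files from newest to oldest. */
    String get(String row) {
        String value = memstore.get(row);
        if (value != null) {
            return value;
        }
        for (int i = hfiles.size() - 1; i >= 0; i--) {
            value = hfiles.get(i).get(row);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        BulkLoadRecoveryDemo server = new BulkLoadRecoveryDemo();
        server.put("row1", "a");                 // protected by the WAL

        Map<String, String> file = new TreeMap<>();
        file.put("row2", "b");
        server.bulkLoad(file);                   // protected by being a durable file

        server.crashAndRecover();
        System.out.println(server.get("row1"));  // prints "a" (recovered from the WAL)
        System.out.println(server.get("row2"));  // prints "b" (the file never left storage)
    }
}
```

In this model the only data a crash can destroy is what sits in `memstore`, which is why `put` needs the WAL and `bulkLoad` does not.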