
Is there a way to do a hard delete (real delete) in HBase?


I know that HBase does not physically delete records and only sets a tombstone marker. But what if the amount of data keeps growing, and some day you want to reduce its size by performing a hard delete (real delete) on selected rows?


Solution

  • Delete markers and deleted cells are removed during major compaction. Minor compaction only merges smaller HFiles into larger ones. You can trigger a major compaction manually with the following HBase shell command:

    major_compact "table name"
    

    Compaction (both minor and major) is an online operation, so no maintenance window is needed to perform it.

    Keep in mind that a major compaction can take a long time, since it rewrites all the HFiles. To avoid a negative performance impact on heavily loaded systems, consider scheduling it outside peak hours.

    Major compaction also happens automatically (by default every 7 days). The interval between scheduled major compactions is controlled by the hbase.hregion.majorcompaction parameter, which is specified in milliseconds (604800000 for 7 days).

    A minor compaction can also be promoted to a major one, for example when it ends up selecting all the HFiles in a store.

    For further details, I suggest the excellent HBase Reference Guide.
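
To make the difference between the two compaction types concrete, here is a toy model in Python. It is only a sketch and not HBase code: the names `Cell`, `HFile`, `minor_compact` and `major_compact` are made up for illustration, a "file" is just a list of cells, and a delete marker is simplified to a per-row tombstone that masks every cell in that row with an equal or older timestamp.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical model: a cell is (row, timestamp, value).
# A value of None stands for a tombstone (delete marker).
Cell = Tuple[str, int, Optional[str]]
HFile = List[Cell]


def minor_compact(files: List[HFile]) -> HFile:
    """Merge several files into one sorted file.

    Tombstones and the cells they mask are kept, so no space is reclaimed.
    """
    merged = [cell for f in files for cell in f]
    # Sort by row, newest timestamp first.
    return sorted(merged, key=lambda c: (c[0], -c[1]))


def major_compact(files: List[HFile]) -> HFile:
    """Merge all files and drop tombstones plus every cell they mask."""
    merged = minor_compact(files)
    # Newest tombstone timestamp seen for each row.
    deleted_at: Dict[str, int] = {}
    for row, ts, value in merged:
        if value is None:
            deleted_at[row] = max(ts, deleted_at.get(row, ts))
    # Keep only live cells written after the row's newest tombstone.
    return [
        (row, ts, value)
        for row, ts, value in merged
        if value is not None and ts > deleted_at.get(row, -1)
    ]


older = [("row1", 1, "a"), ("row2", 1, "b")]
newer = [("row1", 2, None)]  # delete marker for row1

after_minor = minor_compact([older, newer])  # still 3 entries, tombstone included
after_major = major_compact([older, newer])  # only ("row2", 1, "b") remains
```

In this model the deleted data for `row1` is still physically present after the minor compaction and disappears only after the major one, which is why `major_compact` is the step that actually frees space.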