hadoop hdfs hdp

Hadoop: how to rebalance HDFS


We have an HDP 2.6.5 cluster with 8 DataNodes. All machines run RHEL 7.6.

The cluster is managed by Ambari 2.6.1.

Each DataNode (worker machine) has two disks, and each disk is 1.8T.

When we log in to the DataNode machines, we see different disk usage from node to node.

For example, on the first DataNode the usage is (from df -h):

/dev/sdb                  1.8T  839G  996G  46% /grid/sdc
/dev/sda                  1.8T 1014G  821G  56% /grid/sdb

On the second DataNode the usage is:

/dev/sdb                  1.8T  1.5T  390G  79% /grid/sdc
/dev/sda                  1.8T  1.5T  400G  79% /grid/sdb

On the third DataNode the usage is:

/dev/sdb                  1.8T  1.7T  170G  91% /grid/sdc
/dev/sda                  1.8T  1.7T  169G  91% /grid/sdb

and so on

The big question is: why does HDFS not rebalance data across the HDFS disks?

We expected the used space to be roughly the same on all disks across all DataNode machines.

Why does the used space differ between datanode1, datanode2, datanode3, and so on?

Is there any advice on HDFS tuning parameters that could help us?

This is critical, because one disk can reach 100% while others are only around 50% used.



Solution

  • This is known behaviour in HDP 2.6: HDFS does not move existing blocks on its own, so the balancer has to be run explicitly. There are many possible reasons for unbalanced block distribution.
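
To see how uneven the DataNodes actually are from the HDFS side, something like the following may help. It is only a sketch: `datanode_usage` is a hypothetical helper name, and it assumes the `hdfs dfsadmin -report` output contains one `Hostname:` line and one `DFS Used%:` line per DataNode, which is what Hadoop 2.x and 3.x normally print.

```shell
#!/usr/bin/env bash
# Print "<hostname> <DFS Used%>" for each DataNode, reading
# `hdfs dfsadmin -report` output on stdin.
# Assumption: each DataNode section has a "Hostname:" line followed
# later by a "DFS Used%:" line. The cluster-wide "DFS Used%:" line at
# the top of the report is skipped because no hostname precedes it.
datanode_usage() {
  awk -F': ' '
    /^Hostname:/  { host = $2 }
    /^DFS Used%:/ { if (host != "") { print host, $2; host = "" } }
  '
}

# Usage on a live cluster (typically as the hdfs user), least-used first:
#   hdfs dfsadmin -report | datanode_usage | sort -k2 -n
```

If the gap between the lowest and highest value is well above the balancer threshold (10 percent by default), a balancer run is likely to move data.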

    HDFS-1312 introduced a disk balancer, which evens out usage across the disks inside a single DataNode. Imbalance between DataNodes is handled separately by the HDFS Balancer.
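
As a rough sketch of how the disk balancer from HDFS-1312 might be used: the helper below (`disk_spread`, a hypothetical name) measures the gap between the fullest and emptiest data disk on one node from `df -h` output, assuming the data disks are mounted under /grid as in the question. The `hdfs diskbalancer` commands in the comments exist in Apache Hadoop 3.x and need `dfs.disk.balancer.enabled=true`; whether your HDP 2.6 build ships them is something to check first.

```shell
#!/usr/bin/env bash
# Print the gap, in percentage points, between the fullest and the
# emptiest data disk, reading `df -h`-style lines on stdin.
# Assumption: data disks are mounted under /grid (as in the question).
disk_spread() {
  awk '$6 ~ /^\/grid\// {
         use = $5; sub(/%/, "", use); use += 0   # "46%" -> 46
         if (n == 0 || use > max) max = use
         if (n == 0 || use < min) min = use
         n++
       }
       END { if (n > 0) print max - min }'
}

# On a DataNode:
#   df -h | disk_spread
#
# If the gap is large and the disk balancer is available and enabled
# (dfs.disk.balancer.enabled=true in hdfs-site.xml), the usual flow is:
#   hdfs diskbalancer -plan "$(hostname -f)"        # prints where the plan JSON was written
#   hdfs diskbalancer -execute <path-to-plan.json>  # path taken from the -plan output
#   hdfs diskbalancer -query "$(hostname -f)"       # progress of the running plan
```

For the first DataNode in the question (46% and 56%) this prints 10, so the disks inside each node are already fairly close; the larger problem there is the difference between nodes.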

    The following articles should help you tune the HDFS Balancer more efficiently:

    1. HDFS Balancer (1): 100x Performance Improvement
    2. HDFS Balancer (2): Configurations & CLI Options
    3. HDFS Balancer (3): Cluster Balancing Algorithm
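
For a quick start before reading those articles, a minimal wrapper around the balancer could look like this. It is a sketch, not a tuned setup: `run_balancer` and `mib_to_bytes` are hypothetical helper names, and the defaults (threshold 10, 100 MiB/s) are only illustrative. `hdfs balancer -threshold` and `hdfs dfsadmin -setBalancerBandwidth` are the standard commands; the bandwidth is given in bytes per second per DataNode.

```shell
#!/usr/bin/env bash
# Convert MiB/s to the bytes-per-second value that
# `hdfs dfsadmin -setBalancerBandwidth` expects.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

# Raise the balancer bandwidth, then run the balancer.
# Arguments: threshold in percent (default 10), bandwidth in MiB/s (default 100).
# With DRY_RUN=1 the commands are only printed, not executed.
run_balancer() {
  local threshold="${1:-10}" bw_mib="${2:-100}"
  local run=""
  [ "${DRY_RUN:-0}" = "1" ] && run="echo"
  $run hdfs dfsadmin -setBalancerBandwidth "$(mib_to_bytes "$bw_mib")"
  $run hdfs balancer -threshold "$threshold"
}

# Usage (typically as the hdfs user):
#   DRY_RUN=1 run_balancer 5 50   # show what would run
#   run_balancer 5 50             # balance until every DataNode is within
#                                 # 5 percentage points of the cluster average
```

A lower threshold gives a more even cluster but a longer run, and a higher bandwidth finishes sooner at the cost of more network and disk load on the DataNodes.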

    I would suggest upgrading to HDP 3.x, as HDP 2.x is no longer supported by Cloudera Support.