Tags: hadoop, hdfs, cloudera

HDFS Configured Capacity higher than disk capacity


I have an 11-node cluster running Cloudera Express 5.11 on CentOS. It originally consisted of 7 nodes; 4 more were added later. Disk capacity is the same on every node: 5.4 TB.

The problem I'm having is that the hdfs dfsadmin -report command shows wrong disk-usage values, especially for the Configured Capacity: 6.34 TB on the first 7 nodes and 21.39 TB on the last 4.

For example, in one node I have the following report:

Decommission Status : Normal
Configured Capacity: 23515321991168 (21.39 TB)
DFS Used: 4362808995840 (3.97 TB)
Non DFS Used: 14117607018496 (12.84 TB)
DFS Remaining: 3838187159552 (3.49 TB)
DFS Used%: 18.55%
DFS Remaining%: 16.32%
Configured Cache Capacity: 2465202176 (2.30 GB)
Cache Used: 0 (0 B)
Cache Remaining: 2465202176 (2.30 GB)
Cache Used%: 0.00%
Cache Remaining%: 100.00%

Running the df command on the dfs.data.dir folders showed me that the DFS Used value (not the percentage) is correct, but the other values are way off. I have read that HDFS may show values that are not up to date, but I've been seeing the same values for days, even after restarting all services and rebooting all machines.
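
For reference, this is roughly how I compared the two views; the /data/N paths are placeholders for the actual dfs.data.dir entries on my nodes:

# HDFS's per-datanode view
hdfs dfsadmin -report

# The OS's view of the same data directories
df -h /data/1 /data/2 /data/3 /data/4

# Space actually occupied by the block files (this matched DFS Used)
du -sh /data/1 /data/2 /data/3 /data/4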

What bugs me the most is that:

  1. The Configured Capacity is way higher than the true capacity (how could HDFS infer 21 TB when the node has only 5.4 TB?)
  2. The two sets of nodes report two different values (6.34 TB vs. 21.39 TB)

What could be the causes for these values? And is there a way to fix them?

PS: the reason I'm asking is that, with the wrong values, HDFS underestimates the DFS Used% and thus fails to rebalance files across the nodes. Indeed, the node for which I posted the values above has:

  • DFS Used: ~4 TB (correct)
  • DFS Used%: ~19% (wrong)

Every other node has:

  • DFS Used: ~2 TB (correct)
  • DFS Used%: ranging from 11% to 28% (wrong)

This puts the DFS Used% of the affected node below the cluster average, so the HDFS balancer infers that the node should not be rebalanced.
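
If I understand the balancer correctly, it only moves blocks between nodes whose DFS Used% deviates from the cluster-wide average by more than a configurable threshold, which is why these wrong percentages matter:

# The threshold is in percentage points (10 by default); with a bogus
# Configured Capacity, the full node's DFS Used% sits below the cluster
# average, so the balancer never selects it as a source of blocks.
hdfs balancer -threshold 10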

PS2: one thing I have noticed is that the first set of nodes runs CentOS 6.9, while the second runs CentOS 6.8. Could this somehow contribute to the problem?


Solution

  • Update

    After a year and a half, I found the actual source of the problem.

    The reason is that I have several directories listed in the dfs.datanode.data.dir parameter of HDFS. Apparently, HDFS estimates the Configured Capacity by summing up the capacity of every directory. The problem is: if two directories live on the same partition, the size of that partition is counted twice! Oddly enough, I haven't found any mention of this in the documentation.
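
    A quick way to check whether this is happening (the /data/N paths are placeholders): if df reports the same filesystem for several data directories, each of them contributes that partition's full size to the Configured Capacity.

    # Show which filesystem each HDFS data directory lives on
    df -h /data/1 /data/2 /data/3 /data/4

    # Count how many data directories share each underlying device;
    # any count > 1 means that partition is summed multiple times
    df -P /data/1 /data/2 /data/3 /data/4 | awk 'NR>1 {print $1}' | sort | uniq -c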

    This was giving me problems because the first group of machines had 4 HDFS directories spread over 3 partitions of ~1.8 TB each (so only one partition was counted twice, inflating the real ~5.4 TB to roughly 7.2 TB of raw space, which surfaces as the reported 6.34 TB of usable capacity), while the second group had all 4 HDFS directories on a single ~5.4 TB partition (which was therefore counted four times: 4 × 5.4 TB ≈ 21.6 TB, matching the reported 21.39 TB).

    Ultimately, the problem is the result of a heterogeneous partition layout across the machines combined with a low-level detail of HDFS that is not properly documented.

    I ended up creating two sets of HDFS directory configurations in Cloudera: one for the first group of machines (with 3 directories, one per partition) and one for the second group (with a single directory on the only partition), as sketched below. Be careful with this operation, as it involves rebalancing data.
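
    In Cloudera Manager terms, that means two DataNode role groups with different dfs.datanode.data.dir values, along these lines (the paths are illustrative):

    # Role group for the first 7 nodes: one directory per partition
    dfs.datanode.data.dir = /data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn

    # Role group for the last 4 nodes: a single directory on the only partition
    dfs.datanode.data.dir = /data/1/dfs/dn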

    Original answer

    What could be the causes for these values?

    After some research, it seems that this issue appeared whenever the cluster gained new resources (i.e., new disks or new nodes): HDFS set the Configured Capacity of each affected Datanode to the total capacity of all the Datanodes involved in the change. That is, when we upgraded the disks of the first 7 nodes, each node's capacity became the total capacity of that group; when we added 4 more nodes, the capacity of each new node became the total capacity of the new group. Could this be due to Cloudera Manager? Possibly (that's my guess), but I don't have proof.

    And is there a way to fix them?

    I've read Hadoop's Java code to understand where the nodes' Configured Capacity is taken from, and it appears to come from the Namenode's namespace image (which is a binary file and, AFAIK, not editable).
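
    The image can at least be inspected read-only with the offline image viewer; the fsimage file name below is a placeholder for an actual checkpoint:

    # Dump the binary fsimage to XML for inspection (read-only)
    hdfs oiv -p XML -i fsimage_0000000000000000042 -o fsimage.xml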

    What I ended up doing was decommissioning the unbalanced node (which triggered the re-replication of its blocks on the remaining nodes), deleting the HDFS data on that node, recommissioning it, and rebalancing the data. It is not the solution I was looking for, but at least it got my data rebalanced correctly; a rough sketch of the equivalent manual sequence follows.
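
    For reference (Cloudera Manager automates the same steps; the hostname and paths are placeholders, and the exclude file is whatever dfs.hosts.exclude points to):

    # 1. Add the node to the exclude file and start decommissioning
    echo "dn08.example.com" >> /etc/hadoop/conf/dfs.exclude
    hdfs dfsadmin -refreshNodes

    # 2. Wait until the node shows "Decommissioned" in the report
    hdfs dfsadmin -report

    # 3. On the node itself, wipe the old block data
    rm -rf /data/1/dfs/dn /data/2/dfs/dn /data/3/dfs/dn /data/4/dfs/dn

    # 4. Remove the node from the exclude file and refresh again,
    #    then restart the DataNode to recommission it
    hdfs dfsadmin -refreshNodes

    # 5. Rebalance the cluster
    hdfs balancer -threshold 10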