What is the use of the VERSION file under hadoop.tmp.dir?


I recently formatted the namenode. When I then started the Hadoop daemons, the datanode failed with the error below:

2019-01-11 10:39:15,449 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/app/hadoop/tmp/dfs/data/ java.io.IOException: Incompatible clusterIDs in /app/hadoop/tmp/dfs/data: namenode clusterID = CID-76c39119-061a-4ecf-9de1-3a6610ca57dd; datanode clusterID = CID-90359d7f-b1a5-431e-8035-bc4b9e2ea8b9

As a workaround, I deleted the tmp directory and recreated it, and it is now working fine. Copying the namenode's clusterID into the datanode's VERSION file also got it working.

Looking at my backups, I can see that the clusterID was different for the datanode and the namenode, and that setup was working earlier.

Can someone explain what each value represents, and whether the two clusterIDs need to be the same or different?

Datanode VERSION file:

    storageID=DS-7628c4d7-a508-406c-94fa-9b00f45b4f42
    clusterID=CID-73f3a584-4a8a-4260-856c-5a2062b6ae61
    cTime=0
    datanodeUuid=40834363-2025-4e9a-bb1e-e489bf13cad9
    storageType=DATA_NODE
    layoutVersion=-57

Namenode VERSION file:

    namespaceID=1181871748
    clusterID=CID-73f3a584-4a8a-4260-856c-5a2062b6ae61
    cTime=1547187830726
    storageType=NAME_NODE
    blockpoolID=BP-2120424576-127.0.1.1-1547187830726
    layoutVersion=-63

Solution

  • The clusterID must be the same on the datanode and the namenode. Alternatively, delete the <dfs.datanode.data.dir>/ and <dfs.namenode.name.dir>/ directories and format the namenode, so that the HDFS cluster starts with a fresh cluster ID (this discards any existing HDFS data). Refer to the Hadoop tutorial site for more detail.
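
The first option (making the datanode's clusterID match the namenode's) can be sketched in Python roughly as follows. This is only an illustration, assuming the VERSION files are plain key=value text as shown in the question. The function names are hypothetical and not part of Hadoop, and the two clusterIDs in the sample are the ones from the error message above. The sketch works on file contents as strings; reading and writing the real VERSION files (with the datanode stopped) is left out.

```python
import re


def parse_version(text):
    """Parse the key=value lines of a VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def cluster_ids_match(namenode_text, datanode_text):
    """Return True when both VERSION files carry the same clusterID."""
    nn_id = parse_version(namenode_text).get("clusterID")
    dn_id = parse_version(datanode_text).get("clusterID")
    return nn_id is not None and nn_id == dn_id


def with_namenode_cluster_id(namenode_text, datanode_text):
    """Return the datanode text with its clusterID line set to the namenode's."""
    nn_id = parse_version(namenode_text)["clusterID"]
    # A function replacement avoids any escape handling in the new value.
    return re.sub(r"(?m)^clusterID=.*$",
                  lambda _match: "clusterID=" + nn_id,
                  datanode_text)


if __name__ == "__main__":
    # Hypothetical, shortened file contents for illustration only.
    namenode = ("clusterID=CID-76c39119-061a-4ecf-9de1-3a6610ca57dd\n"
                "storageType=NAME_NODE\n")
    datanode = ("clusterID=CID-90359d7f-b1a5-431e-8035-bc4b9e2ea8b9\n"
                "storageType=DATA_NODE\n")

    print(cluster_ids_match(namenode, datanode))
    fixed = with_namenode_cluster_id(namenode, datanode)
    print(cluster_ids_match(namenode, fixed))
```

Only the clusterID line is rewritten; every other line of the datanode file is returned unchanged.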