What is the purpose of the VERSION file under hadoop.tmp.dir?

I recently formatted the namenode, and when I started the Hadoop daemons the datanode failed with the error below:

2019-01-11 10:39:15,449 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to add storage directory [DISK]file:/app/hadoop/tmp/dfs/data/ java.io.IOException: Incompatible clusterIDs in /app/hadoop/tmp/dfs/data: namenode clusterID = CID-76c39119-061a-4ecf-9de1-3a6610ca57dd; datanode clusterID = CID-90359d7f-b1a5-431e-8035-bc4b9e2ea8b9

To resolve it, I deleted the tmp directory and recreated it, and everything is working fine now. Copying the namenode clusterID into the datanode's VERSION file also got it working.

Looking at my backups, I can see that the clusterID was different for the datanode and the namenode, and that setup was working earlier.

Can someone explain what each value represents, and whether the two clusterIDs need to be the same or different?

Datanode VERSION file:

storageID=DS-7628c4d7-a508-406c-94fa-9b00f45b4f42
clusterID=CID-73f3a584-4a8a-4260-856c-5a2062b6ae61
cTime=0
datanodeUuid=40834363-2025-4e9a-bb1e-e489bf13cad9
storageType=DATA_NODE
layoutVersion=-57

Namenode VERSION file:

namespaceID=1181871748
clusterID=CID-73f3a584-4a8a-4260-856c-5a2062b6ae61
cTime=1547187830726
storageType=NAME_NODE
blockpoolID=BP-2120424576-127.0.1.1-1547187830726
layoutVersion=-63

Solution

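The clusterID must be the same in the datanode and the namenode. It is generated when the namenode is formatted, so reformatting the namenode produces a new clusterID that no longer matches the one stored in the datanode's VERSION file. To fix the mismatch, either copy the namenode's clusterID into the datanode's VERSION file, or delete the <dfs.datanode.data.dir> and <dfs.namenode.name.dir> directories and format the namenode to start the HDFS cluster with a fresh clusterID. The second option removes all existing HDFS data. You can refer to the Hadoop tutorial site for more detail.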
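
If you want to check for this mismatch before starting the daemons, a small script can compare the clusterID in the two VERSION files. This is only a rough sketch, not part of Hadoop: the function names are made up for illustration, and it assumes each VERSION file sits in the `current` subdirectory of its storage directory and holds plain key=value lines like the ones in the question.

```python
from pathlib import Path


def parse_version(text):
    """Parse key=value lines from VERSION file content, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def cluster_ids_match(namenode_text, datanode_text):
    """Return True only if both contents carry the same, non-missing clusterID."""
    nn_id = parse_version(namenode_text).get("clusterID")
    dn_id = parse_version(datanode_text).get("clusterID")
    return nn_id is not None and nn_id == dn_id


def read_version(storage_dir):
    """Read <storage_dir>/current/VERSION (assumed layout; adjust to your setup)."""
    return Path(storage_dir, "current", "VERSION").read_text()


if __name__ == "__main__":
    # The two clusterIDs reported in the error message from the question.
    namenode = "clusterID=CID-76c39119-061a-4ecf-9de1-3a6610ca57dd\nstorageType=NAME_NODE\n"
    datanode = "clusterID=CID-90359d7f-b1a5-431e-8035-bc4b9e2ea8b9\nstorageType=DATA_NODE\n"
    print(cluster_ids_match(namenode, datanode))  # prints False
```

On a real node you would pass the contents returned by read_version() for your own name and data directories, for example read_version("/app/hadoop/tmp/dfs/data") for the datanode directory shown in the error.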