docker hadoop hdfs

Incompatible clusterID between namenode and datanode in Hadoop


On Windows 11, I installed the latest available version of Docker Desktop. Following that, I visited the official Apache Hadoop GitHub repository at https://github.com/apache/hadoop/tree/docker-hadoop-3. Subsequently, I downloaded the necessary files and executed the following commands:

docker-compose build

docker-compose up -d

This successfully built and started the Hadoop cluster. However, after stopping the containers via Docker Desktop and starting them again, the datanode failed to start because its clusterID was incompatible with the namenode's. I can work around this by deleting the /tmp/hadoop-hadoop/dfs/data/ directory on the datanode (rm -rf /tmp/hadoop-hadoop/dfs/data/) and restarting the datanode, but that removes all data, which is not the desired outcome.

I was able to manually copy the clusterID from the namenode to the datanode, but that did not resolve the problem: it appeared that the namenode was trying to re-format the datanode but did not have permission to do so. I also observed that both the namenode and the datanode appear to re-format on every startup. Formatting should ideally happen only once, and subsequent startups should not re-format.
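To confirm that the failure really is a clusterID mismatch, it can help to compare the clusterID= line in the two storage VERSION files. The following is a minimal sketch, not part of the official image; the function names cluster_id and same_cluster are made up for illustration, and it assumes you have already copied both VERSION files to a place where the script can read them.

```shell
#!/bin/sh
# Hypothetical helpers for comparing Hadoop storage VERSION files.

# Print the value of the clusterID= line in the given VERSION file.
cluster_id() {
    sed -n 's/^clusterID=//p' "$1"
}

# Print both IDs; the exit status is 0 only when they are non-empty and equal.
# $1 = namenode VERSION file, $2 = datanode VERSION file
same_cluster() {
    nn_id=$(cluster_id "$1")
    dn_id=$(cluster_id "$2")
    echo "namenode: $nn_id"
    echo "datanode: $dn_id"
    [ -n "$nn_id" ] && [ "$nn_id" = "$dn_id" ]
}
```

The datanode file would be /tmp/hadoop-hadoop/dfs/data/current/VERSION given the data directory above. The namenode file is current/VERSION under whatever dfs.namenode.name.dir resolves to in your configuration, so check that path rather than assuming it. The files can be pulled out of the containers with docker cp and then passed to same_cluster.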

Is there a way to fix this without deleting the data directory on the datanode, so that the data persists and the datanode starts cleanly?


Solution

  • The apache/hadoop image is intended only for testing and apparently cannot be modified. If you want to use Apache Hadoop, install it on a Linux base image and build your own image from there with a Dockerfile.
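If you do build your own image, one common way to avoid re-formatting on every start is to have the entrypoint format the namenode only when its storage directory has never been formatted. The sketch below is an assumption about how such an entrypoint could look, not something taken from the apache/hadoop image; is_formatted and format_once are hypothetical names, and the name directory you pass in must match dfs.namenode.name.dir in your own hdfs-site.xml.

```shell
#!/bin/sh
# Hypothetical helpers for a custom image's entrypoint script.

# Succeeds when the directory already holds a formatted namenode,
# indicated by the current/VERSION file that formatting creates.
is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Run the format command only when no VERSION file exists yet.
# $1 = namenode name directory; remaining arguments = the format command,
# passed in so the function can be dry-run without Hadoop installed.
format_once() {
    name_dir=$1
    shift
    if is_formatted "$name_dir"; then
        echo "namenode already formatted, skipping"
    else
        echo "formatting namenode"
        "$@"
    fi
}
```

In a real entrypoint the call would look something like format_once "$NAME_DIR" hdfs namenode -format -nonInteractive, followed by starting the namenode. For the guard to help across container re-creation, the name and data directories also need to live on a Docker volume rather than in the container's own filesystem.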