hadoop, replication, hdfs

HDFS Reduced Replication Factor


I've reduced the replication factor from 3 to 1, but I don't see any activity from the namenode or between the datanodes to remove the over-replicated HDFS file blocks. Is there a way to monitor or force the replication job?


Solution

  • Changing dfs.replication only affects files created after the change; it does not modify the replication factor of files that already exist.

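    To change the replication factor of existing files, run the following command, which applies recursively to every file under / (the -w flag makes it wait until the blocks reach the target replication, which can take a long time on a large cluster):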

    hdfs dfs -setrep -R -w 1 /
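
As for the monitoring half of the question, here is a minimal sketch, assuming the hdfs client is on your PATH and you are allowed to run fsck on the path. The function names current_rep and replication_summary are made up for illustration; only the hdfs dfs -stat and hdfs fsck calls are real commands.

```shell
# Hypothetical helper: print the replication factor recorded for one path.
# %r is the replication format specifier of `hdfs dfs -stat`.
current_rep() {
    hdfs dfs -stat '%r' "$1"
}

# Hypothetical helper: keep only the replication-related lines of the
# fsck report (over-, under- and mis-replicated block counts, plus the
# average block replication).
replication_summary() {
    hdfs fsck "$1" | grep -Ei 'replicated blocks|block replication'
}
```

You could call current_rep on a file you care about to confirm the new factor, and re-run replication_summary / a few times: the over-replicated block count should fall as the namenode schedules the excess replicas for deletion.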