I have small files arriving in HDFS every day. I am planning to use Hadoop Archives (HAR), but how can I archive small files that keep arriving daily? For example, if I get 5 files today, I need to archive them, and if I get 5 more tomorrow, I need to append them to the previous day's archive.
You cannot add files to an existing HAR file, because archives are immutable once created. You need to either un-archive and re-archive everything each time, or pool the files for some days and create a new archive for each batch going forward.
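The pooling option could be scripted roughly as follows. This is only a sketch under assumptions that are not in the question: the daily files are assumed to land in one subdirectory per day (named like 2024-01-01) under a hypothetical parent directory /data/incoming, and the archives are assumed to go to a hypothetical /data/archives. The helper name pooled_har_command and the archive naming scheme are made up for illustration. The command layout itself follows the documented `hadoop archive -archiveName <name>.har -p <parent> <src>* <dest>` form.

```python
from datetime import date, timedelta


def pooled_har_command(parent, first_day, days, dest):
    """Build a `hadoop archive` command that packs `days` daily
    subdirectories of `parent` into a single new HAR under `dest`.

    Assumes one subdirectory per day named YYYY-MM-DD (hypothetical layout).
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    # Source paths are given relative to the -p parent directory.
    day_dirs = [(first_day + timedelta(days=i)).isoformat() for i in range(days)]
    last_day = first_day + timedelta(days=days - 1)
    # Name the archive after the date range it covers (illustrative scheme).
    archive_name = f"files-{first_day.isoformat()}_{last_day.isoformat()}.har"
    return ["hadoop", "archive", "-archiveName", archive_name,
            "-p", parent, *day_dirs, dest]


if __name__ == "__main__":
    cmd = pooled_har_command("/data/incoming", date(2024, 1, 1), 7, "/data/archives")
    # Print the command; to actually run it on a cluster node, pass `cmd`
    # to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Each run produces a separate archive for its batch rather than modifying an earlier one, so a scheduler (cron, Oozie, etc.) could call this once per pooling window. Creating the archive does not delete the source directories, so the originals would need to be removed separately once the archive has been verified.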