I am running a process that produces an arbitrary number of files in an arbitrary number of sub-folders. I am interested in the number of distinct sub-folders, and I am currently trying to solve this with bash and find (I do not want to use a scripting language).
So far I have:
find models/quarter/ -name settings.json | wc -l
However, this obviously does not take the structure of find's output into account; it just counts every file returned.
Sample of the find return:
models/quarter/1234/1607701623/settings.json
models/quarter/1234/1607701523/settings.json
models/quarter/3456/1607701623/settings.json
models/quarter/3456/1607702623/settings.json
models/quarter/7890/1607703223/settings.json
I am interested in the number of distinct folders in the top folder models/quarter, so the appropriate result for the sample above would be 3 (1234, 3456, 7890). It is a requirement that each folder to be counted contains a sub-folder (which is a Unix timestamp, as you might have recognized), and that this sub-folder contains the file settings.json.
My gut tells me it should be possible, e.g. with awk, but I am certainly no bash pro. Any help is greatly appreciated, thanks.
find models/quarter/ -name settings.json | awk -F\/ '{ if (strftime("%s",$4) == $4) { fil[$3]="" } } END { print length(fil) }'
Using awk. Pass the output of find to awk and set / as the field separator. Check that the 4th field is a valid Unix timestamp: strftime("%s",$4) formats the timestamp back as epoch seconds, so it round-trips to $4 only if $4 really is one. If it is, add the third field (the folder name) as a key of the array fil; duplicate keys are stored only once. At the end, print the length of the array fil, i.e. the number of distinct folders. Note that both strftime and length on an array are GNU awk (gawk) extensions, not POSIX awk.
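If gawk is not available, a portable sketch of the same idea replaces the awk script with cut, sort -u and wc: constrain find to paths exactly three levels deep so the structure requirement is enforced by find itself, extract the third path component, deduplicate, and count. The directory layout built below is just the sample from the question, recreated in a temporary directory for demonstration.

```shell
#!/bin/sh
# Recreate the sample tree from the question in a temp dir (illustration only).
tmp=$(mktemp -d)
mkdir -p "$tmp/models/quarter/1234/1607701623" \
         "$tmp/models/quarter/1234/1607701523" \
         "$tmp/models/quarter/3456/1607701623" \
         "$tmp/models/quarter/3456/1607702623" \
         "$tmp/models/quarter/7890/1607703223"
for d in "$tmp"/models/quarter/*/*/; do
  touch "$d/settings.json"
done
cd "$tmp" || exit 1

# -mindepth 3 -maxdepth 3 matches only models/quarter/<id>/<timestamp>/settings.json,
# i.e. files exactly one sub-folder below each <id> folder.
# cut -d/ -f3 keeps the <id> component; sort -u deduplicates; wc -l counts.
find models/quarter -mindepth 3 -maxdepth 3 -name settings.json \
  | cut -d/ -f3 | sort -u | wc -l
```

For the sample tree this prints 3. Unlike the awk version, this does not verify that the intermediate folder name is numeric; if that matters, insert e.g. `grep -E '^models/quarter/[^/]+/[0-9]+/settings\.json$'` into the pipeline before cut.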