
Unix command to check for missing files in a sequence


The files in a folder follow the naming format below.

File format: fact_type_<key>_<partid>
fact_type_123_1
fact_type_123_2
fact_type_123_3
fact_type_123_4
fact_type_124_1
fact_type_124_2
fact_type_124_3
fact_type_124_4
..
fact_type_130_1

Each key should have 4 files (i.e., for every key there should be files ending in _1, _2, _3 and _4).

Keys should be in sequence: in the example above, the file after fact_type_124_4 should be fact_type_125_1.

These files are loaded by an external process, and the next process fails unless every file between the start and end key is present (4 files per key, for all keys from 123 through 130).

Right now I run the cut command below, copy the output into Excel, and then look for missing keys by hand:

ls -1a | cut -d '_' -f3 | sort | uniq 

Please help me with a command that validates this directly in the folder.


Solution

  • With bash and GNU sort (this reports missing part files for every key that has at least one file):

    for f1 in fact_type_*; do
      echo "${f1%_[0-9]}"
    done | sort -u |\
    while read -r f2; do
      for ((i=1; i<=4; i++)); do
        f="${f2}_${i}"
        [[ ! -e "$f" ]] && echo "missing $f"
      done
    done
    

    Output (e.g.):

    missing fact_type_126_4
    missing fact_type_127_1
    missing fact_type_127_2
    missing fact_type_127_4
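
The loop above only looks at keys that already have at least one file, so a key that is absent entirely (125, for example) would not be reported. A minimal sketch that also covers that case follows. It assumes that the start and end keys are the lowest and highest keys present in the folder, and that keys are plain decimal integers without leading zeros. The function name `check_fact_files` and its three optional arguments (directory, prefix, parts per key) are hypothetical and only serve the illustration. It avoids bash-only syntax, so it should run under bash or a POSIX sh.

```shell
# check_fact_files [dir] [prefix] [parts]   (hypothetical helper)
# Prints "missing <name>" for every expected file that is absent between
# the lowest and highest key found in dir.
# Assumption: keys are decimal integers without leading zeros.
check_fact_files() (
  dir=${1:-.} prefix=${2:-fact_type} parts=${3:-4}
  min='' max='' status=0

  # Pass 1: find the lowest and highest key among the existing files.
  for f in "$dir/$prefix"_*_*; do
    [ -e "$f" ] || continue                     # glob matched nothing
    key=${f%_*}                                 # drop the trailing _<part>
    key=${key##*_}                              # keep only the key
    case $key in ''|*[!0-9]*) continue ;; esac  # skip non-numeric keys
    if [ -z "$min" ] || [ "$key" -lt "$min" ]; then min=$key; fi
    if [ -z "$max" ] || [ "$key" -gt "$max" ]; then max=$key; fi
  done

  if [ -z "$min" ]; then
    echo "no ${prefix}_<key>_<part> files found in $dir" >&2
    exit 2
  fi

  # Pass 2: walk every key in the range and every part 1..parts.
  key=$min
  while [ "$key" -le "$max" ]; do
    part=1
    while [ "$part" -le "$parts" ]; do
      name=${prefix}_${key}_${part}
      if [ ! -e "$dir/$name" ]; then
        echo "missing $name"
        status=1
      fi
      part=$((part + 1))
    done
    key=$((key + 1))
  done
  exit "$status"   # 0 = complete, 1 = something missing
)
```

Calling `check_fact_files /path/to/folder` (or just `check_fact_files` inside the folder) prints one "missing ..." line per absent file and returns a non-zero status when anything is missing, so it can gate the next process. If the expected start and end keys are known in advance, it would be safer to pass them in rather than derive them from the files, since a missing first or last key cannot be detected from the folder contents alone.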