Tags: arrays, bash, macos, special-characters, nested-loops

Loop through a list of filenames and remove every string held in a variable/array from each filename with bash


I have a list of strings stored in a variable and would like to remove those strings from a list of filenames. I read the strings from a file that I can add to and modify over time. Some of the strings may cover only part of the text that needs to be removed from a filename, while the rest is covered by another line in the list. That's why I need to loop through the entire list for every filename.

I'm familiar with using a while loop to loop through a list, but I'm not sure how to loop through each line of the variable to remove all of the strings from a filename.

Here's an example:

getstringstoremove=$(cat /text/from/some/file.txt)
echo "$getstringstoremove"

# Or the above can be an array
getstringstoremove=$(cat /text/from/some/file.txt)
declare -a arr=($getstringstoremove)

Either of the above should hold the following lines:

-SOMe.fil
(Ena)M-3_1
.So[Me].filEna)M-3_2
SOMe.fil(Ena)M-3_3

Here's the loop I was running to grab all filenames from a directory and strip everything except the filenames from the ls output:

ls -l "/files/in/a/folder/" | awk -v N=9 '{sep=""; for (i=N; i<=NF; i++) {printf("%s%s",sep,$i); sep=OFS}; printf("\n")}' | while read line; do 
echo "$line"

It returns the following result on each iteration:

# 1st loop 
ilikecoffee1-SOMe.fil(Ena)M-3_1.jpg
# iterate thru $getstringstoremove to remove all strings from the above file.
# 2nd loop
ilikecoffee2.So[Me].filEna)M-3_2.jpg
# iterate thru $getstringstoremove again
# 3rd loop
ilikecoffee3SOMe.fil(Ena)M-3_3.jpg
# iterate thru $getstringstoremove and again
done

The final desired output would be the following:

ilikecoffee1.jpg
ilikecoffee2.jpg
ilikecoffee3.jpg

I'm running this in bash on a Mac. I hope this makes sense; I'm stuck and could use some help.

If someone has a better way of doing this, by all means share it. It doesn't have to be done the way I have it listed above.


Solution

  • You can get the new filenames with this awk one-liner:

    $ awk 'NR==FNR{a[$0];next} {for(i in a){n=index($0,i);if(n){$0=substr($0,1,n-1)substr($0,n+length(i))}}} 1' rem.txt files.lst
    

    This assumes your exclusion strings are in rem.txt and there's a files list in files.lst.
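To try the command without touching real files, here is a minimal sketch that builds both input files from the strings and filenames in the question and runs a variant of the one-liner on them. The scratch directory and the `workdir` and `result` variable names are assumptions made for this illustration; the `substr()` call starts at position 1 because awk string positions are 1-based.

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory so the sketch does not touch real files.
workdir=$(mktemp -d)

# rem.txt: one removal string per line (the quoted heredoc keeps them literal).
cat > "$workdir/rem.txt" <<'EOF'
-SOMe.fil
(Ena)M-3_1
.So[Me].filEna)M-3_2
SOMe.fil(Ena)M-3_3
EOF

# files.lst: one filename per line.
cat > "$workdir/files.lst" <<'EOF'
ilikecoffee1-SOMe.fil(Ena)M-3_1.jpg
ilikecoffee2.So[Me].filEna)M-3_2.jpg
ilikecoffee3SOMe.fil(Ena)M-3_3.jpg
EOF

# index() does a literal substring search, so [ ] ( ) need no escaping.
result=$(awk 'NR==FNR{a[$0];next} {for(i in a){n=index($0,i);if(n){$0=substr($0,1,n-1)substr($0,n+length(i))}}} 1' \
  "$workdir/rem.txt" "$workdir/files.lst")

printf '%s\n' "$result"
# prints:
#   ilikecoffee1.jpg
#   ilikecoffee2.jpg
#   ilikecoffee3.jpg

rm -rf "$workdir"
```

Each removal string is deleted at most once per filename, at its first occurrence, and `for (i in a)` visits the strings in an unspecified order. That does not matter for this sample, but it could if one removal string only appears after another has been removed.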

    Spaced out for easier commenting:

    NR==FNR {               # suck the first file into the indices of an array,
      a[$0]
      next
    }
    
    {
      for (i in a) {        # for each filename, step through the array,
        n=index($0,i)       # search for an occurrence of this string,
        if (n) {            # and if found,
                            # rewrite the line with the string missing
                            # (substr() positions start at 1).
          $0=substr($0,1,n-1)substr($0,n+length(i))
        }
      }
    }
    
    1                       # and finally, print the line.
    

    If you stow the above script in a file, say foo.awk, you could run it as:

    $ awk -f foo.awk rem.txt files.lst
    

    to see the resulting filenames.

    Note that this just shows you how to build the new filenames. If you want to rename each file in a directory, it's best to avoid running the renames directly from awk and to use shell constructs designed for handling files instead, like a for loop:

    for f in path/to/*.jpg; do
      mv -v "$f" "$(awk -f foo.awk rem.txt - <<<"$f")"
    done
    

    This should be pretty obvious except perhaps for the awk arguments, which are:

    • -f foo.awk, use the awk script from this filename,
    • rem.txt, your list of removal strings,
    • -, a hyphen telling awk to read standard input as a second input, after rem.txt, and
    • <<<"$f", a "here-string" to provide that input to awk.
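The rename loop can be made a little more defensive. The sketch below is one possible variant, not the only way to do it: it strips only the basename (so a removal string can never match part of the directory path), skips files whose name would not change, refuses to overwrite an existing file, and defaults to a dry run. The function names `strip_name` and `rename_stripped` and the `DRY_RUN` variable are hypothetical, introduced only for this example, and the awk program is inlined so the block is self-contained.

```shell
#!/usr/bin/env bash
# strip_name (hypothetical helper): print NAME with each string listed in
# REMFILE removed. The removal list is read first, then NAME from stdin ("-").
strip_name() {
  local remfile=$1 name=$2
  awk 'NR==FNR { a[$0]; next }
       {
         for (i in a) {
           n = index($0, i)
           if (n) $0 = substr($0, 1, n-1) substr($0, n + length(i))
         }
       }
       1' "$remfile" - <<<"$name"
}

# rename_stripped (hypothetical wrapper): rename every *.jpg in DIR.
# DRY_RUN defaults to 1, which only prints the planned renames.
rename_stripped() {
  local dir=$1 remfile=$2 f base new
  for f in "$dir"/*.jpg; do
    [ -e "$f" ] || continue              # the glob matched nothing
    base=${f##*/}                        # drop the directory part
    new=$(strip_name "$remfile" "$base")
    # Skip empty results (e.g. an empty removal list) and unchanged names.
    [ -n "$new" ] && [ "$new" != "$base" ] || continue
    if [ -e "$dir/$new" ]; then
      echo "skip: $dir/$new already exists" >&2
      continue
    fi
    if [ "${DRY_RUN:-1}" = 1 ]; then
      echo "would rename: $base -> $new"
    else
      mv -v -- "$f" "$dir/$new"
    fi
  done
}
```

Running `rename_stripped path/to rem.txt` previews the changes, and `DRY_RUN=0 rename_stripped path/to rem.txt` performs them.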

    Note that this awk script works with both gawk and the non-GNU awk that is included in macOS.
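Since the question asks about looping over the strings in bash itself, here is an alternative sketch that uses bash parameter expansion instead of awk. This is a different technique from the awk script above: it applies the strings in file order and removes every occurrence of each one, not just the first. The function name `strip_all` is hypothetical. It avoids `mapfile`, so it should also run on the older bash that ships with macOS.

```shell
#!/usr/bin/env bash
# strip_all (hypothetical helper): remove every occurrence of each line of
# REMFILE from NAME using only bash parameter expansion.
strip_all() {
  local remfile=$1 name=$2 s
  # IFS= and -r keep leading spaces and backslashes intact; the || clause
  # still processes a last line that has no trailing newline.
  while IFS= read -r s || [ -n "$s" ]; do
    [ -n "$s" ] || continue        # ignore blank lines
    name=${name//"$s"/}            # the quotes make [ ] ( ) literal, not glob
  done < "$remfile"
  printf '%s\n' "$name"
}

# Usage, assuming rem.txt holds the removal strings from the question:
#   strip_all rem.txt 'ilikecoffee2.So[Me].filEna)M-3_2.jpg'
```

Reading the file line by line also sidesteps the word splitting and glob expansion that an unquoted `arr=($getstringstoremove)` would apply to strings containing `[` or spaces.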