
Renaming multiple files in bash by removing prefix and suffix containing special characters %?=


I recently made a request against the Google Cloud Service API endpoint and used wget to download a lot of files into a single folder. Because every sub-directory separator / was replaced by %2F and ?alt=media was appended, all the downloaded file names are contaminated with these strings, e.g.

hg38%2Fv0%2FHomo_sapiens_assembly38.dict?alt=media
hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media

I tested the following in bash and it returned the result I wanted:

echo "$hg19%2Fv0%2FHomo_sapiens_assembly19.fasta.alt?alt=media" | sed -e "s/^$hg19%2Fv0%2F//" -e "s/\?.*//g"

i.e. Homo_sapiens_assembly19.fasta.alt. Unfortunately, when I scaled it up using

for file in *; do 
    mv "$file" '$(echo "$file" | sed -e "s/^$hg19%2Fv0%2F//" -e "s/\?.*//g")' ; 
done

all the files turned into one file named "$file". I couldn't figure out why.

Can anyone provide a solution to my problem? And if some of the file names contain a different number of "%2F" repeats, how can I elegantly keep only the string after the last "%2F" and strip the "?alt=media" from the end in the same line?

Thank you in advance.


Solution

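  • Use .* to match everything up to the last %2F. Because .* is greedy, 's/^hg.*%2F//' removes the whole prefix no matter how many times %2F occurs.

    Put the command substitution inside double quotes, not single quotes. Inside single quotes, $(...) is not executed, so mv receives the literal text as the destination. See Difference between single and double quotes in Bash

    Don't put $ before hg at the beginning. Inside double quotes the shell treats $hg19 as a variable; it is unset, so it expands to nothing and the pattern no longer contains hg19.

    It's not a requirement, but sed commands are usually put in single quotes, unless you're using variables in the substitution.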

    for file in *; do 
        mv "$file" "$(echo "$file" | sed -e 's/^hg.*%2F//' -e 's/\?.*//g')" ; 
    done
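
As an alternative to sed, the same clean-up can be sketched with bash parameter expansion alone. This is a minimal sketch rather than part of the answer above: clean_name and rename_downloads are hypothetical helper names, and it assumes the unwanted prefix always ends at the last %2F and the unwanted suffix starts at the first ?.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the cleaned-up version of one file name.
clean_name() {
    local name=$1
    name=${name##*%2F}   # remove the longest prefix ending in %2F (keeps the text after the last %2F)
    name=${name%%\?*}    # remove the longest suffix starting at a literal ?
    printf '%s\n' "$name"
}

# Hypothetical helper: rename every file in a directory whose name contains %2F.
rename_downloads() {
    local dir=${1:-.} file new
    for file in "$dir"/*%2F*; do
        [ -e "$file" ] || continue                  # the glob matched nothing
        new=$dir/$(clean_name "${file##*/}")
        if [ -e "$new" ]; then                      # never overwrite an existing path
            echo "skipping $file: $new already exists" >&2
            continue
        fi
        mv -- "$file" "$new"
    done
}
```

Calling rename_downloads with no argument works on the current directory; passing a path (for example rename_downloads ./downloads, where ./downloads is only an illustrative name) works on that directory instead. Restricting the glob to *%2F* leaves files that were never affected untouched, and the existence check avoids clobbering a file when two downloads clean up to the same name.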