How to remove all non-numeric characters from filename recursively


Here's the current folder structure:

/home/ubuntu/Desktop/pictures/
/home/ubuntu/Desktop/pictures/folder1
/home/ubuntu/Desktop/pictures/folder1/John Doe - 1234567.JPG

/home/ubuntu/Desktop/pictures/folder2
/home/ubuntu/Desktop/pictures/folder2/Homer Simpson - 7654321.jpg
/home/ubuntu/Desktop/pictures/folder2/Lisa Simpson - 321456.jpg

/home/ubuntu/Desktop/pictures/folder3
/home/ubuntu/Desktop/pictures/folder3/Foo Bar - 234123.JPG
/home/ubuntu/Desktop/pictures/folder3/Bar Foo - 876542.JPG

I'd like to build a script that loops through all the folders in the "pictures" folder and renames all "JPG" and "jpg" files to their numeric values, so that the filename "John Doe - 1234567.JPG" becomes "1234567.JPG".

I did try some shell scripting, but I got this working only when the jpg files are in one folder:

ubuntu@ubuntu:~/Desktop/pictures/in_one_folder$ ls
John Doe - 1234567.JPG          Foo Bar - 234123.JPG
Homer Simpson - 7654321.jpg     Bar Foo - 876542.JPG
Lisa Simpson - 321456.jpg       script.sh

I started with this script:

for f in *JPG *jpg;
do
        file=$f
        remove_non_numeric=$(echo "$file" | sed 's/[^0-9]*//g')
        add_extension="$remove_non_numeric.jpg"
        echo "$add_extension"
        mv "$file" "$add_extension"
done

And here's the result:

ubuntu@ubuntu:~/Desktop/pictures/in_one_folder$ ls
1234567.jpg     234123.jpg
7654321.jpg     876542.jpg
321456.jpg      script.sh

So this did what it was supposed to. Now the question is, how could I set it to loop through the folders? Or if there's no way to modify the code that I came up with (a newbie trying to learn, haha), then I'd appreciate other ways to achieve the result. Although I'm trying to get this to work on Linux, a Windows approach would be fine too.

Thanks for helping!


Solution

  • Here's your code adjusted to work recursively:

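    # Directory to search; point this at the top of the tree (e.g. ~/Desktop/pictures).
    topdir=~/"Desktop/pictures/in_one_folder"
    # -iname matches .jpg and .JPG; -print0 with read -d '' keeps names with spaces intact.
    find "$topdir" -type f -iname '*.jpg' -print0 |
        while IFS= read -r -d '' path; do
            dir="${path%/*}"     # directory part of the path
            file="${path##*/}"   # file name without the directory
            remove_non_numeric=$(echo "$file" | sed 's/[^0-9]*//g')
            add_extension="$remove_non_numeric.jpg"   # note: always lowercase .jpg
            echo "$dir/$add_extension"
            mv "$path" "$dir/$add_extension"
        done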
    

    It uses find to locate all the matching files and then processes them one by one in a while loop.

    One way to make this slightly faster is to avoid sed. You can delete the non-numeric characters with pure bash as follows:

            remove_non_numeric="${file//[^0-9]/}"
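
For reference, here is a rough sketch of how the pieces above could fit together with the pure-bash substitution in place of sed. It is only one way to do it, not a definitive version. The function name rename_to_digits is made up for this example. The sketch also departs from the loop above in two ways, both of which are my own assumptions: it keeps the extension as written (the loop above always writes ".jpg", while the question asks for "1234567.JPG"), and it skips a file rather than overwrite one when the target name already exists.

```shell
#!/usr/bin/env bash
# Sketch only: rename_to_digits is a hypothetical helper name, not part of the answer above.
# Renames every *.jpg / *.JPG below the given directory to the digits found in its name.
rename_to_digits() {
    local topdir=$1
    local path dir file ext digits target
    find "$topdir" -type f -iname '*.jpg' -print0 |
        while IFS= read -r -d '' path; do
            dir=${path%/*}               # strip the shortest "/..." suffix: the directory
            file=${path##*/}             # strip the longest ".../" prefix: the file name
            ext=${file##*.}              # extension as written, e.g. JPG or jpg
            digits=${file%.*}            # file name without the extension
            digits=${digits//[^0-9]/}    # delete every non-digit character
            target=$dir/$digits.$ext
            # Assumption: names without digits, and targets that already exist, are skipped.
            if [[ -z $digits || -e $target ]]; then
                printf 'skipped: %s\n' "$path" >&2
                continue
            fi
            mv -- "$path" "$target"
            printf '%s\n' "$target"
        done
}
```

It would be called as rename_to_digits ~/Desktop/pictures. Trying it on a copy of the folder first is a good idea, since the renames are not reversible.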