Tags: shell, unix, find, mv

Finding files by type and renaming them based on their parent directory


I've searched the internet as thoroughly as I can for something to help me solve the problem I'm currently having.

I have a folder containing many directories, and within these there are both documents and images.

My goal is to rename these files based on their parent folder, for example:

/main/secondary/file

As all my files are already generically named, I want to rename my images to secondary0001.jpg, secondary0002.jpg and so on.

I've been looking all over and attempting to use all manner of methods to create a working script.

Currently I feel this may be my best effort so far.

find $2 -type f -iname IMG_[0-9][0-9][0-9][0-9].jpg -exec mv -n {}$dirname {}.jpg\; 

$2 contains the path of my top-level folder, so $2 would equate to Alpha/Primary/Secondary/file

I'd really appreciate any kind of assistance, thanks.


Solution

  • Assuming that your image file names don't contain white space and your folder names don't contain white space (so there's no need for extreme antics to deal with extremely awkward file names), then you can consider:

    find "$directory" -type f -iname 'IMG_[0-9][0-9][0-9][0-9].jpg' -print |
    while read file
    do
        base=$(basename "$file")
        dir=$(dirname "$file")
        bdir=$(basename "$dir")
        suffix=$(echo "$base" | sed 's/^[Ii][Mm][Gg]_//')
        mv "$file" "$dir/$bdir$suffix"
    done
    

    I didn't say anything about efficiency. Since you didn't tag this with Bash or Ksh, I didn't assume any of their facilities for variable editing. Apart from the use of $(…) in lieu of back-ticks `…` and the -iname option to find, this would work with essentially any shell derived from the Bourne shell in the last 20 years or so.

    If you decide you need spaces etc. in your directory or file names, you will need to review the code. It is mostly safe because it uses double quotes around variable references like "$file", but a plain read strips leading and trailing white space and treats backslashes as escapes (IFS= read -r avoids both), and you really need to worry if your file names or directory names can contain newlines.
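If awkward names do turn out to matter, one possible approach is to let find hand the names straight to a small inline script, so they never pass through a pipe and read. The sketch below is an illustration under assumptions rather than a drop-in replacement: rename_by_parent is a hypothetical helper name, and it assumes a find that supports -iname and -exec … {} + (GNU and BSD find both do).

```shell
# rename_by_parent DIR
# Hypothetical helper: applies the same renaming rule as the loop above,
# but passes file names as arguments so spaces and newlines are preserved.
rename_by_parent() {
    find "$1" -type f -iname 'IMG_[0-9][0-9][0-9][0-9].jpg' -exec sh -c '
        for file do
            base=${file##*/}               # file name without directories
            dir=${file%/*}                 # directory part of the path
            bdir=${dir##*/}                # name of the immediate parent
            suffix=${base#[Ii][Mm][Gg]_}   # drop the leading IMG_ (any case)
            mv -- "$file" "$dir/$bdir$suffix"
        done' sh {} +
}
```

It would be called as rename_by_parent "$directory". The parameter expansions rely on every path that find prints containing at least one slash, which holds because each match lies below the starting directory.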


    Using your method I now have a way of renaming these files. However, when I rename them based on their directory, I overwrite files and lose many of them. Is there a way to avoid this, such as adding digits to the end of the filename?

    1. Test by putting an echo in front of the mv so you know what will happen, without actually making it happen.
    2. I think you must have either modified the code or have a slightly different situation from what was reasonably inferred. There's an example below with a set of empty files in a brand new junk directory hierarchy. The input names are unique per directory; the output names are unique per directory; there's no way for the script to generate collisions and lose data unless there are already files using the revised naming scheme present in the directory. Even if you move the files up a level, the names should all be unique because the subdirectories were unique in the first place.
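If collisions are a real possibility in your situation (for example, files that already use the new naming scheme), one option is to check the target before moving and add digits when it is taken. This is a minimal sketch, not something from the script above: safe_mv is a hypothetical helper name, it assumes the target name ends in an extension such as .jpg, and the check-then-move sequence is not atomic, so it is only suitable when nothing else is writing to the directory at the same time.

```shell
# safe_mv SRC DEST
# Hypothetical helper: moves SRC to DEST, or to DEST with _1, _2, ...
# inserted before the extension if that name is already taken.
# The body runs in a subshell so its variables do not leak to the caller.
safe_mv() (
    src=$1
    dest=$2
    stem=${dest%.*}     # everything before the last dot
    ext=${dest##*.}     # everything after the last dot
    n=0
    while [ -e "$dest" ]
    do
        n=$((n + 1))
        dest="${stem}_$n.$ext"
    done
    mv -- "$src" "$dest"
)
```

In the renaming loop, the mv line would then become safe_mv "$file" "$dir/$bdir$suffix".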

    Example run:

    $ mkdir junk
    $ cd junk
    $ for dir in primary secondary tertiary
    > do (mkdir $dir; cd $dir; touch $(seq -f 'IMG_%04.0f.jpg' 1 10))
    > done
    $ ls
    primary   secondary tertiary
    $ ls *
    primary:
    IMG_0001.jpg IMG_0002.jpg IMG_0003.jpg IMG_0004.jpg IMG_0005.jpg IMG_0006.jpg IMG_0007.jpg IMG_0008.jpg IMG_0009.jpg IMG_0010.jpg
    
    secondary:
    IMG_0001.jpg IMG_0002.jpg IMG_0003.jpg IMG_0004.jpg IMG_0005.jpg IMG_0006.jpg IMG_0007.jpg IMG_0008.jpg IMG_0009.jpg IMG_0010.jpg
    
    tertiary:
    IMG_0001.jpg IMG_0002.jpg IMG_0003.jpg IMG_0004.jpg IMG_0005.jpg IMG_0006.jpg IMG_0007.jpg IMG_0008.jpg IMG_0009.jpg IMG_0010.jpg
    $ directory=.
    $ find "$directory" -type f -iname 'IMG_[0-9][0-9][0-9][0-9].jpg' -print |
    > while read file
    > do
    >     base=$(basename "$file")
    >     dir=$(dirname "$file")
    >     bdir=$(basename "$dir")
    >     suffix=$(echo "$base" | sed 's/^[Ii][Mm][Gg]_//')
    >     mv "$file" "$dir/$bdir$suffix"
    > done
    $ ls
    primary   secondary tertiary
    $ ls *
    primary:
    primary0001.jpg primary0003.jpg primary0005.jpg primary0007.jpg primary0009.jpg
    primary0002.jpg primary0004.jpg primary0006.jpg primary0008.jpg primary0010.jpg
    
    secondary:
    secondary0001.jpg secondary0003.jpg secondary0005.jpg secondary0007.jpg secondary0009.jpg
    secondary0002.jpg secondary0004.jpg secondary0006.jpg secondary0008.jpg secondary0010.jpg
    
    tertiary:
    tertiary0001.jpg tertiary0003.jpg tertiary0005.jpg tertiary0007.jpg tertiary0009.jpg
    tertiary0002.jpg tertiary0004.jpg tertiary0006.jpg tertiary0008.jpg tertiary0010.jpg
    $
    

    When I created 1000 files in each directory and timed the move, it took 46 seconds to rename the 3000 files (running on Mac OS X 10.10.4 with a hard disk and no SSD). That's a bit longer than I expected.

    Revising the script as shown below cuts the runtime on 1000 files per directory from 46 seconds to 8, a speed-up of nearly a factor of 6. That's a worthwhile improvement, but it still feels like the script is not running as fast as it might on a modern Linux machine. That could be a combination of an ancient machine, hard disks, the HFS+ file system, and Mac OS X overhead (the title bar of the window changes to show the currently running command name as the script is running, for example).

    directory='.'
    time find "$directory" -type f -iname 'IMG_[0-9][0-9][0-9][0-9].jpg' -print |
    while read file
    do
        #base=$(basename "$file")
        base=${file##*/}
        #dir=$(dirname "$file")
        dir=${file%/*}
        #bdir=$(basename "$dir")
        bdir=${dir##*/}
        #suffix=$(echo "$base" | sed 's/^[Ii][Mm][Gg]_//')
        suffix=${base#[Ii][Mm][Gg]_}
        mv "$file" "$dir/$bdir$suffix"
    done
    

    To get much further improvement, I'd go to Perl and have it do the rename operation as a system call instead of invoking a separate program. This would cut down a lot more process overhead (there are still 3000 mv commands in the revised script, whereas Perl or an equivalent would have just one process for the whole move).

    Note that the parameter substitutions work because the names are constrained to be well-behaved (there is at least one slash in each of them; the root directory isn't named, etc). Edge cases dealt with by basename and dirname commands are not handled by the parameter substitutions. Be cautious about generalizing.
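To make that caveat concrete, here is a small sketch that prints what the commands and the expansions produce for a few awkward paths. The compare function is a hypothetical helper written for this illustration only.

```shell
# compare PATH
# Hypothetical helper: prints the results of dirname and basename next to
# the results of the ${path%/*} and ${path##*/} expansions.
compare() {
    path=$1
    printf 'path=[%s] dirname=[%s] %%/*=[%s] basename=[%s] ##*/=[%s]\n' \
        "$path" "$(dirname "$path")" "${path%/*}" \
        "$(basename "$path")" "${path##*/}"
}

compare ./primary/IMG_0001.jpg   # well-behaved: both approaches agree
compare IMG_0001.jpg             # no slash at all
compare /IMG_0001.jpg            # file directly in the root directory
compare primary/                 # trailing slash
```

For the name with no slash, ${path%/*} returns the name unchanged where dirname prints a dot; for the file in the root directory it returns an empty string where dirname prints a slash; and with a trailing slash ${path##*/} is empty where basename prints primary.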