
Regular expression in Python Shutil integer range to move files


I have a folder with 12500 pictures. The filenames are consecutive numbers, so the folder looks like this:

0.jpg
1.jpg
2.jpg
3.jpg
...
12499.jpg

Now I want to distribute the files. Files 0-7999 should be copied to the first folder, files 8000-9999 to the second folder, and files 10000-12499 to the third folder.

First, I thought I could simply use [0-7999].jpg for the first folder, [8000-9999].jpg for the second and [10000-12499].jpg for the third. However, this does not work. I came up with the following code, based on the wildcards I know, which are ? and *. It works and does the job (note that I commented out shutil.copy and use print instead to check the result):

import glob
import shutil
dest_dir = "/tmp/folder1/"
for file in glob.glob('/tmp/source/?.jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

dest_dir = "/tmp/folder1/"
for file in glob.glob('/tmp/source/??.jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

dest_dir = "/tmp/folder1/"
for file in glob.glob('/tmp/source/???.jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

dest_dir = "/tmp/folder1/"
for file in glob.glob('/tmp/source/[1-7]???.jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

dest_dir = "/tmp/folder2/"
for file in glob.glob('/tmp/source/[8-9]???.jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

dest_dir = "/tmp/folder3/"
for file in glob.glob('/tmp/source/?????.jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

However, I would like to have an elegant solution for this. I googled regular expression with integer range and tried the following:

dest_dir = "/tmp/folder3/"
for file in glob.glob('/tmp/source/\b([0-9]|[1-9][0-9]|[1-9][0-9][0-9]|1000).jpg'):
    #shutil.copy(file, dest_dir)
    print(file)

This does not work. What does a correct implementation look like? I need a solution for both shutil.copy and shutil.move, but I think it is the same for both.


Solution
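
A note on why the bracket attempts fail: glob patterns are shell-style wildcards, not regular expressions, and [...] always stands for exactly one character. So [0-7999] is the character set 0-7 plus 9, not the numbers 0 to 7999, and regex syntax such as | or ( ) is treated as literal text. A minimal sketch using fnmatch, which follows the same matching rules as glob; the file names in the list are made up for illustration:

```python
import fnmatch

# Hypothetical file names, chosen only to illustrate the matching rules.
names = ["5.jpg", "8.jpg", "9.jpg", "42.jpg", "7999.jpg"]

# "[0-7999]" is a character class: the range 0-7 plus a literal 9.
# It matches exactly ONE character, so only single-digit names can match.
matched = [n for n in names if fnmatch.fnmatch(n, "[0-7999].jpg")]
print(matched)  # ['5.jpg', '9.jpg']
```

Since wildcards cannot express a numeric range, it is simpler to match every *.jpg and compare the number in Python.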

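  • You can get all files (*.jpg) and then decide for each file where it should go, based on the number in its name. shutil.copy and shutil.move take the same (source, destination) arguments, so the loop is the same for both: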

    import glob
    import os
    import shutil

    # Lower bound of each range -> destination folder
    dest_dirs = {0: "/tmp/folder1/", 8000: "/tmp/folder2/", 10000: "/tmp/folder3/"}
    for file in glob.glob('/tmp/source/*.jpg'):
        base = os.path.basename(file)  # remove path
        withoutext = os.path.splitext(base)[0]  # remove extension
        try:
            number = int(withoutext)
        except ValueError:
            continue  # file name is not a number
        # Check the bounds in ascending order; the last one that fits wins
        for key in sorted(dest_dirs):
            if number >= key:
                destination = dest_dirs[key]
        # shutil.copy(file, os.path.join(destination, base))  # or shutil.move
        print(file, os.path.join(destination, base))
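
If the number of ranges grows, the inner loop can be replaced by a binary search over the sorted lower bounds. The following is only a sketch of that variant, not the code from the answer above: the names pick_destination, transfer, LOWER_BOUNDS, DEST_DIRS and the dry_run flag are made up for illustration, and the folder paths are the ones from the question.

```python
import bisect
import shutil
from pathlib import Path

# Lower bound of each range, ascending, and the folder for each range.
# Paths are taken from the question; the names here are hypothetical.
LOWER_BOUNDS = [0, 8000, 10000]
DEST_DIRS = ["/tmp/folder1", "/tmp/folder2", "/tmp/folder3"]


def pick_destination(number, bounds=LOWER_BOUNDS, dirs=DEST_DIRS):
    """Return the folder whose range contains `number`."""
    # bisect_right counts how many lower bounds are <= number.
    index = bisect.bisect_right(bounds, number) - 1
    if index < 0:
        raise ValueError(f"{number} is below the smallest lower bound")
    return dirs[index]


def transfer(source_dir, action=shutil.copy, dry_run=True):
    """Apply `action` (shutil.copy or shutil.move) to every numbered .jpg."""
    for path in sorted(Path(source_dir).glob("*.jpg")):
        if not path.stem.isdecimal():
            continue  # skip names that are not plain numbers
        destination = Path(pick_destination(int(path.stem))) / path.name
        if dry_run:
            print(path, "->", destination)  # only show what would happen
        else:
            action(str(path), str(destination))
```

Calling transfer("/tmp/source") only prints the planned operations. transfer("/tmp/source", action=shutil.move, dry_run=False) would actually move the files, assuming the destination folders already exist.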