python-3.x, glob

Alert and stop script if duplicate filename found


I am trying to detect duplicate filenames across several sub-folders, then stop the script and alert the user about the duplicate.

Using these example filenames:

Money_03012019_2019.csv
Money_04012019_2019.csv
Money_04012019_2019 - Copy.csv
Money_05012019_2019.csv
Money_06012019_2019.csv

A duplicate can be in any sub-folder. (The " - Copy" file is only there to mimic a duplicate; the code should recognise a duplicate from the DDMMYYYY part of the filename.)

I have tried the following code:

import glob

flist = glob.glob('/**/Money_*_*.csv', recursive=True)

if len(set(flist)) == len(flist):
    print('No Duplicate')
else:
    print('Duplicate Found')

But it doesn't work, no matter how I adjust it. I thought set() and len() were the way to go here: set() drops duplicates, so comparing its length with the length of flist should reveal them?

Thanks.


Solution

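  • glob.glob() returns a list of file paths, not filenames. Two files with the same name in different folders still have different paths, so every entry in the list is unique and the set is always the same length as the list.

    For example: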

    ['./Money.csv', './Money_Folder/Money.csv']
    

    A simple solution would be this:

    import os

    # Keep only the filename part of each path (also works with Windows separators)
    flist = [os.path.basename(path) for path in flist]
    

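    This removes the directory part of each path and leaves a list of bare filenames, which can then be checked for duplicates with len(set(flist)) == len(flist). Note that this only catches identical filenames: a file such as "Money_04012019_2019 - Copy.csv" has a different name and would not be flagged.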
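
The question asks for duplicates to be recognised by the DDMMYYYY part rather than by the full filename. Here is a minimal sketch of one way to do that. It assumes every filename follows the Money_<DDMMYYYY>_<YYYY> layout shown in the question. The names DATE_PATTERN, find_duplicate_dates and stop_on_duplicates, and the sample folder names, are made up for illustration and are not part of the original answer.

```python
import os
import re
import sys
from collections import defaultdict

# Assumed layout: Money_<DDMMYYYY>_<YYYY>, optionally followed by extra
# text such as " - Copy" before the extension.
DATE_PATTERN = re.compile(r"^Money_(\d{8})_\d{4}")


def find_duplicate_dates(paths):
    """Group paths by their DDMMYYYY token; return only dates seen more than once."""
    by_date = defaultdict(list)
    for path in paths:
        match = DATE_PATTERN.match(os.path.basename(path))
        if match:  # files that do not fit the assumed layout are ignored
            by_date[match.group(1)].append(path)
    return {date: found for date, found in by_date.items() if len(found) > 1}


def stop_on_duplicates(paths):
    """Exit the script with a message if any date appears in more than one file."""
    duplicates = find_duplicate_dates(paths)
    if duplicates:
        lines = [f"{date}: {', '.join(found)}" for date, found in duplicates.items()]
        sys.exit("Duplicate Found\n" + "\n".join(lines))
    print("No Duplicate")


# Hypothetical sample paths, standing in for the list returned by glob.glob().
sample_paths = [
    "2019/Money_03012019_2019.csv",
    "2019/Money_04012019_2019.csv",
    "backup/Money_04012019_2019 - Copy.csv",
    "2019/Money_05012019_2019.csv",
]
print(find_duplicate_dates(sample_paths))
```

In the real script, calling stop_on_duplicates(flist) on the list returned by glob.glob() should end the run with a message on stderr as soon as two files share a date. Grouping with a dictionary, rather than comparing lengths, also tells the user which files clash.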