Compare filenames and sizes using a partial path


My final goal is to sync new files from location X to location Y.

The source path is /srv/dev-disk-by-uuid-5a792a17-de43-4e09-9c99-36e48c6c30aa/datafs

The destination path is /mnt/Usb/DataSync

I want to compare every file under the source path, including its subfolders, with the file at the same relative location under the destination. For example:

/srv/dev-disk-by-uuid-5a792a17-de43-4e09-9c99-36e48c6c30aa/datafs/photo/png21.png

is the same as

/mnt/Usb/DataSync/photo/png21.png

So in this example, only the photo directory and the filename should be compared, not the full path.

If a file has the same name but a different size (meaning the file changed), I want to copy it from the source to the destination.

I looked at glob and os.walk, but every solution I found compares full paths, and I need to match on a partial path only. As far as I can tell, glob doesn't fit because it only searches a single directory, and os.walk doesn't fit because it works with the whole path.

My question:

How can I check whether two files have the same name and the same partial path (like the 'photo' directory in the example above)?

def SyncFiles(DstDiskPath):
    SrcDisk = /srv/dev-disk-by-uuid-5a792a17-de43-4e09-9c99-36e48c6c30aa/datafs/

    print("[+] Drive OK, syncing")
    for path, CurrentDirectory, files in os.walk(SrcDisk)
        for file in files:
            if (file in DstDiskPath):
                if os.path.getsize((os.path.join(path, file)) != os.path.getsize((os.path.join(path, file))
                    shutil.copy2((os.path.join(path, file), (os.path.join(DstDiskPath, file)))

Solution

  • Just trim off the part you want to ignore.

    import os
    import shutil

    def SyncFiles(DstDiskPath):
        # Syntax fix: the path must be a quoted string
        SrcDisk = "/srv/dev-disk-by-uuid-5a792a17-de43-4e09-9c99-36e48c6c30aa/datafs/"

        print("[+] Drive OK, syncing")
        for path, CurrentDirectory, files in os.walk(SrcDisk):
            # Replace the source prefix with the destination. relpath avoids
            # off-by-one slicing when SrcDisk ends with a slash.
            dstpath = os.path.join(DstDiskPath, os.path.relpath(path, SrcDisk))
            for file in files:
                # Fix: "file in DstDiskPath" only tests whether the name is a
                # substring of the path string, not whether the file exists
                srcfile = os.path.join(path, file)
                dstfile = os.path.join(dstpath, file)
                if os.path.exists(dstfile):
                    if os.path.getsize(srcfile) != os.path.getsize(dstfile):
                        shutil.copy2(srcfile, dstfile)
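
The "trim off the prefix" step can be isolated and checked on its own. The following is a minimal sketch, assuming POSIX-style paths; `map_to_destination` is a hypothetical helper name used only for illustration and is not part of the original answer.

```python
import os

def map_to_destination(src_root, dst_root, src_path):
    """Return the path under dst_root that mirrors src_path under src_root."""
    # relpath strips the source prefix, with or without a trailing slash
    rel = os.path.relpath(src_path, src_root)
    # normpath removes the "." segment produced for the root directory itself
    return os.path.normpath(os.path.join(dst_root, rel))

src_root = "/srv/dev-disk-by-uuid-5a792a17-de43-4e09-9c99-36e48c6c30aa/datafs/"
print(map_to_destination(src_root, "/mnt/Usb/DataSync", src_root + "photo/png21.png"))
# prints /mnt/Usb/DataSync/photo/png21.png
```

Using `os.path.relpath` instead of slicing by `len(...)` means the result does not depend on whether the source root was written with a trailing slash.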
    

    Comparing file sizes is not a reliable way to check whether a file has changed, though: a file can be edited without its size changing. You probably want to use an existing tool like rsync for your backups.
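
If rsync is not an option, one possible approach is to compare file contents instead of sizes alone. The sketch below is an illustration under assumptions, not the answer's method: `sha256_of`, `needs_copy`, and `sync_tree` are hypothetical names, and copying files that are missing from the destination is an assumption based on the stated goal of syncing new files.

```python
import hashlib
import os
import shutil

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def needs_copy(src_file, dst_file):
    """Return True if dst_file is missing or its contents differ from src_file."""
    if not os.path.exists(dst_file):
        return True
    # Cheap check first: different sizes always mean different contents
    if os.path.getsize(src_file) != os.path.getsize(dst_file):
        return True
    # Same size: fall back to comparing content hashes
    return sha256_of(src_file) != sha256_of(dst_file)

def sync_tree(src_root, dst_root):
    """Copy new or changed files from src_root to dst_root.

    Returns the copied paths, relative to src_root.
    """
    copied = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        dst_dir = os.path.normpath(os.path.join(dst_root, rel_dir))
        for name in filenames:
            src_file = os.path.join(dirpath, name)
            dst_file = os.path.join(dst_dir, name)
            if needs_copy(src_file, dst_file):
                # Create the mirrored directory on demand
                os.makedirs(dst_dir, exist_ok=True)
                shutil.copy2(src_file, dst_file)
                copied.append(os.path.relpath(src_file, src_root))
    return copied
```

Hashing reads every same-sized file in full on both sides, so this is slower than a size check on large trees. It would be called as `sync_tree(SrcDisk, "/mnt/Usb/DataSync")`, using the paths from the question.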