
Download new files only, ignoring file extensions, from remote SFTP server in Python


from datetime import datetime
import pysftp
import fnmatch
import os
from stat import S_IMODE, S_ISDIR, S_ISREG

Hostname = "Hostname"
Username = "Username"
Password = "Password"
cnopts = pysftp.CnOpts()
cnopts.hostkeys = None
with pysftp.Connection(host=Hostname, username=Username, password=Password,
                       cnopts=cnopts) as sftp:
    print("Connection successfully established ... ")

    local_dir = r"D:\testing"
    remote_dir = r'C:\Users\admin\Documents\personal'

    for entry in sftp.listdir(remote_dir):
        root, ext = os.path.splitext(entry)
        for entry1 in sftp.listdir(local_dir):
            root1, ext1 = os.path.splitext(entry1)
            if root == root1:
                if ext != ext1:
                    pass
            elif root != root1:
                remotepath = os.path.join(remote_dir, entry)
                localpath = os.path.join(local_dir, entry)
                sftp.get(remotepath, localpath, preserve_mtime=False)

I'm trying to download files from an SFTP server to a local folder. To do that, I need to compare file names and extensions between the server and the local folder.

For instance, if the server has abc.xlsx, abc.csv and adc.txt, and the local folder already has abc.xlsx, then my script shouldn't copy any file named abc, whether its extension is the same or different.

So when comparing file names and extensions:

  • if the name and extension are the same, don't download
  • if the name is the same but the extension is different, don't download

Solution

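    • You have to use os.listdir, not sftp.listdir, for the local path.
    • There's no need to re-read the local folder for every remote file (unless you somehow aim to cater for multiple files with the same base name in the remote directory).
    • Your code for identifying new files is not correct either: the elif branch downloads the remote file whenever any local file has a different base name, not only when no local file matches.
    • You should not use os.path.join for SFTP paths. SFTP paths always use forward slashes, while os.path.join uses backslashes on Windows.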
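To illustrate the point about SFTP paths, here is a small sketch. It uses ntpath directly (the module behind os.path on Windows) so the result is the same on any platform; the directory /remote/dir is a made-up placeholder, not a path from the question.

```python
import ntpath      # what os.path resolves to on Windows
import posixpath   # forward-slash path handling, suitable for SFTP paths

remote_dir = "/remote/dir"  # hypothetical remote directory

# Joining with Windows rules inserts a backslash, which an SFTP server
# will generally not treat as a path separator.
windows_style = ntpath.join(remote_dir, "abc.xlsx")

# Joining with POSIX rules always uses a forward slash.
sftp_style = posixpath.join(remote_dir, "abc.xlsx")

print(windows_style)  # /remote/dir\abc.xlsx
print(sftp_style)     # /remote/dir/abc.xlsx
```

Plain string concatenation with "/" works just as well for a single path component.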

    This should do:

    localfiles = os.listdir(local_dir)
    for rentry in sftp.listdir(remote_dir):
        rroot, rext = os.path.splitext(rentry)
        found = False
        for lentry in localfiles:
            lroot, lext = os.path.splitext(lentry)
            if rroot == lroot:
                found = True
                break
        if not found:
            remotepath = remote_dir + "/" + rentry
            localpath = os.path.join(local_dir, rentry)
            sftp.get(remotepath, localpath, preserve_mtime=False)
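As a possible variation on the loop above, the comparison can be done with a set of local base names instead of a nested loop. This is only a sketch; the helper name new_remote_files is made up for illustration, and it takes plain lists of names so it can be tried without a server.

```python
import os


def new_remote_files(remote_names, local_names):
    """Return the remote names whose base name has no local counterpart."""
    # Base names (extension stripped) that already exist locally.
    local_roots = {os.path.splitext(name)[0] for name in local_names}
    return [name for name in remote_names
            if os.path.splitext(name)[0] not in local_roots]


# The example from the question: only adc.txt is new.
print(new_remote_files(["abc.xlsx", "abc.csv", "adc.txt"], ["abc.xlsx"]))
# ['adc.txt']
```

In the real script the two lists would come from sftp.listdir(remote_dir) and os.listdir(local_dir). Like the loop above, this does not cater for several new remote files sharing one base name: all of them would be downloaded.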
    

    Though these days, you should not use pysftp, as it is dead. Use Paramiko directly instead. See pysftp vs. Paramiko. The above code works with Paramiko too, using its SFTPClient.listdir and SFTPClient.get.
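A minimal sketch of the same logic with Paramiko might look like this. The helper download_new_files and the remote directory /remote/dir are assumptions made for illustration; the host name and credentials are the placeholders from the question. The helper only needs an object with listdir and get, which Paramiko's SFTPClient provides, and like the code above it does not handle subdirectories.

```python
import os
import posixpath


def download_new_files(sftp, remote_dir, local_dir):
    """Download remote files whose base name is absent from local_dir.

    `sftp` is any object with listdir() and get(), such as a Paramiko
    SFTPClient. Returns the names of the downloaded files.
    """
    local_roots = {os.path.splitext(n)[0] for n in os.listdir(local_dir)}
    downloaded = []
    for name in sftp.listdir(remote_dir):
        if os.path.splitext(name)[0] in local_roots:
            continue  # a local file with this base name already exists
        # Remote path joined with "/", local path with the OS separator.
        sftp.get(posixpath.join(remote_dir, name),
                 os.path.join(local_dir, name))
        downloaded.append(name)
    return downloaded


def main():
    import paramiko  # third-party; imported here so the helper has no hard dependency

    with paramiko.SSHClient() as ssh:
        ssh.load_system_host_keys()  # check the server against known_hosts
        ssh.connect("Hostname", username="Username", password="Password")
        with ssh.open_sftp() as sftp:
            print(download_new_files(sftp, "/remote/dir", r"D:\testing"))
```

Calling main() runs the download against a real server. Unknown host keys are rejected by default, so the server's key has to be in the known_hosts file first.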


    Obligatory warning: Do not set cnopts.hostkeys = None, unless you do not care about security. For the correct solution see Verify host key with pysftp.
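If you stay with pysftp, one way to keep host key verification is to point CnOpts at a known_hosts file instead of clearing hostkeys. The sketch below is an assumption-laden example, not the linked answer's code: the helper make_cnopts is made up, and it presumes the server's key has already been added to the given known_hosts file.

```python
import os


def make_cnopts(known_hosts_path):
    """Build pysftp connection options that verify the server's host key."""
    # Fail early if the file is missing; without it no key can be checked.
    if not os.path.isfile(known_hosts_path):
        raise FileNotFoundError(known_hosts_path)
    import pysftp  # third-party; imported lazily, after the file check
    return pysftp.CnOpts(knownhosts=known_hosts_path)


# Intended use (not run here, since it needs a reachable server):
#
#   import pysftp
#   cnopts = make_cnopts(os.path.expanduser("~/.ssh/known_hosts"))
#   with pysftp.Connection(host=Hostname, username=Username,
#                          password=Password, cnopts=cnopts) as sftp:
#       ...
```

With hostkeys left in place, the connection fails if the server's key is not found in that file, which is the behavior you want.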