I want to read some CSV/Excel files on a secure SFTP folder, make some changes (fixed changes in each file like remove column 2) in those files, upload them to a Postgre DB and also the upload them to a different SFTP path in Python
What's the best way to this?
I have made a connection to the SFTP using pysftp library and am reading the Excel:
import pysftp
import pandas as pd
myHostname = "*****"
myUsername = "****"
myPassword = "***8"
cnopts =pysftp.CnOpts()
cnopts.hostkeys = None
sftp=pysftp.Connection(host=myHostname, username=myUsername,
password=myPassword,cnopts=cnopts)
print ("Connection succesfully stablished ... ")
sftp.chdir('test/test')
#sftp.pwd
a=[]
for i in sftp.listdir_attr():
with sftp.open(i.filename) as f:
df=pd.read_csv(f)
How should I proceed with the upload to DB and making those changes to the CSV permanent?
You have the download part done.
For the upload part, see How to Transfer Pandas DataFrame to .csv on SFTP using Paramiko Library in Python? – While it's for Paramiko, pysftp Connection.open
method behaves identically to Paramiko SFTPClient.open
, so the code is the same (though, you should not use pysftp).
Full code can be like:
with sftp.open("/remote/path/data.csv", "r+", bufsize=32768) as f:
# Download CSV contents from SFTP to memory
df = pd.read_csv(f)
# Modify as you need (just an example)
df.at[0, 'Name'] = 'changed'
# Upload the in-memory data back to SFTP
f.seek(0)
df.to_csv(f, index=False)
# Truncate the remote file in case the new version of the contents is smaller
f.truncate(f.tell())
The above updates the same file. If you want to upload to a different file, use this:
# Download CSV contents from SFTP to memory
with sftp.open("/remote/path/source.csv", "r") as f:
df = pd.read_csv(f)
# Modify as you need (just an example)
df.at[0, 'Name'] = 'changed'
# Upload the in-memory data back to SFTP
with sftp.open("/remote/path/target.csv", "w", bufsize=32768) as f:
df.to_csv(f, index=False)
For the purpose of bufsize
, see:
Writing to a file on SFTP server opened using Paramiko/pysftp "open" method is slow
Obligatory warning: Do not set cnopts.hostkeys = None
, unless you do not care about security. For the correct solution see Verify host key with pysftp.