I have a direct connection to an SFTP server. The connection works and I can list the files in the selected directory without any problem. There are different kinds of files on the server and I have several functions to read them. Below is the piece of code that handles .pdf files, which I read with pdfplumber:
# SSH.connect configuration
sftp = ssh.open_sftp()
path = "/server_path/.."
for filename in sftp.listdir(path):
    fullpath = path + "/" + filename
    if filename.endswith('.pdf'):
        # fullpath - full server path with filename - like /server_path/../file.pdf
        # filename - filename without path - like file.pdf
        with sftp.open(fullpath, 'rb') as fl:
            pdf = pdfplumber.open(fl)
In this for loop I want to read all the .pdf files in the chosen directory. It works for me on localhost without any problem.
I tried to solve it with: with sftp.open(path, 'rb') as fl: but in this case it doesn't work and the following error appears:
Traceback (most recent call last):
pdf = pdfplumber.open(fl)
return cls(open(path, "rb"), **kwargs)
TypeError: expected str, bytes or os.PathLike object, not SFTPFile
pdfplumber.open takes the exact path to the file, including its name, as its argument (in this case fullpath). How can I solve this so that it works directly from the server? And how should I manage memory in this case, since I understand that these files are somehow pulled into memory? Please give me some hints.
Paramiko SFTPClient.open returns a file-like object.
To use a file-like object with pdfplumber, it seems that you can use the load function:
pdf = pdfplumber.load(fl)
You will also want to read this:
Reading file opened with Python Paramiko SFTPClient.open method is slow
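One mitigation for slow reads through a Paramiko file-like object is to start a background read-ahead with SFTPFile.prefetch before handing the object to the parser. The following is only a minimal sketch under that assumption: open_with_prefetch is a hypothetical helper name, sftp is assumed to behave like the paramiko.SFTPClient from the question, and whether this is fast enough for a given PDF depends on how the parser seeks around in the file.

```python
def open_with_prefetch(sftp, fullpath):
    """Open a remote file for reading and start background read-ahead.

    `sftp` is assumed to behave like paramiko.SFTPClient; the helper
    name itself is hypothetical and only used for illustration.
    """
    remote_file = sftp.open(fullpath, "rb")
    # Ask Paramiko to start fetching the file contents in the background,
    # so later read() calls do not each wait for a server round trip.
    remote_file.prefetch()
    return remote_file


# Possible usage inside the loop from the question:
#
#     with open_with_prefetch(sftp, fullpath) as fl:
#         pdf = pdfplumber.load(fl)
```

The returned object is still an SFTPFile, so it can be used in a with block exactly like the result of sftp.open.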
As the Paramiko file-like object seems to work suboptimally when combined with the pdfplumber.load function, you can, as a workaround, download the file to memory instead:
from io import BytesIO

flo = BytesIO()
sftp.getfo(fullpath, flo)
flo.seek(0)
pdf = pdfplumber.load(flo)
See How to use Paramiko getfo to download file from SFTP server to memory to process it
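On the memory question, the download-to-memory workaround can be combined with the directory loop so that only one PDF is held in memory at a time. This is a rough sketch, not a definitive implementation: iter_remote_pdfs is a hypothetical helper name, and sftp is assumed to provide listdir and getfo like paramiko.SFTPClient.

```python
import posixpath
from io import BytesIO


def iter_remote_pdfs(sftp, path):
    """Yield (filename, in-memory buffer) for each .pdf file in a remote directory.

    Hypothetical helper: `sftp` is assumed to behave like paramiko.SFTPClient.
    """
    for filename in sftp.listdir(path):
        if not filename.endswith(".pdf"):
            continue
        fullpath = posixpath.join(path, filename)
        flo = BytesIO()
        sftp.getfo(fullpath, flo)  # download the whole file into the buffer
        flo.seek(0)                # rewind so the consumer reads from the start
        try:
            yield filename, flo
        finally:
            # Release the buffer once the caller moves on to the next file.
            flo.close()


# Possible usage with the connection from the question:
#
#     for filename, flo in iter_remote_pdfs(sftp, path):
#         pdf = pdfplumber.load(flo)
#         ...  # process the PDF before the next iteration
```

Because each buffer is closed when the loop advances, the PDF has to be fully processed inside the loop body. Depending on the installed pdfplumber version, pdfplumber.open(flo) may be the call to use instead of load, since later releases accept file-like objects there.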