Tags: python, python-3.x, sftp, paramiko, pysftp

Download a chunk of a large file using pysftp in Python


I have a use case where I want to read only the top 5 rows of a large CSV file stored on one of my SFTP servers, and I don't want to download the complete file just to read those 5 rows. I am using pysftp in Python to interact with the SFTP server. Is there a way to download only a chunk of the file with pysftp instead of the complete file?

If there are other Python libraries or techniques I can use, please guide me. Thanks


Solution

  • First, do not use pysftp. It is a dead, unmaintained project. Use Paramiko instead. See pysftp vs. Paramiko.

    To read data from a specific point in the file, open a file-like object representing the remote file with the Paramiko SFTPClient.open method (or the equivalent pysftp Connection.open), then use it as you would any local file:

    • Use .seek to set the read pointer to the desired offset.
    • Use .read to read the data.
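    # offset is the byte position to start at; count is the number of bytes to read
    with sftp.open("/remote/path/file", "r", bufsize=32768) as f:
        f.seek(offset)        # move the read pointer to the desired byte offset
        data = f.read(count)  # fetch only count bytes from that position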
    

    For the purpose of bufsize, see:
    Writing to a file on SFTP server opened using Paramiko/pysftp "open" method is slow
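For the specific case in the question (the first 5 rows of a CSV), something along these lines might work. This is only a sketch, not a definitive implementation: `read_first_rows` and `read_remote_head` are hypothetical helper names, the host, credentials and path are placeholders, and it assumes the first rows contain no quoted fields with embedded line breaks. It reads the remote file in chunks and stops as soon as enough line breaks have been seen, so the rest of the file is never requested.

```python
import csv


def read_first_rows(remote_file, n_rows, chunk_size=32768, encoding="utf-8"):
    """Return the first n_rows CSV rows from a binary file-like object.

    Reads chunk_size bytes at a time and stops once n_rows line breaks
    have been seen. Assumes no quoted field spans multiple lines.
    """
    buf = b""
    while buf.count(b"\n") < n_rows:
        chunk = remote_file.read(chunk_size)
        if not chunk:  # end of file reached before n_rows lines were found
            break
        buf += chunk
    # Keep only the complete leading lines; a trailing partial line is dropped.
    lines = [line.decode(encoding) for line in buf.splitlines()[:n_rows]]
    return list(csv.reader(lines))


def read_remote_head(host, username, password, remote_path, n_rows=5):
    """Hypothetical wrapper: all connection details are placeholders."""
    import paramiko  # imported here so read_first_rows works without Paramiko

    with paramiko.SSHClient() as ssh:
        ssh.load_system_host_keys()
        ssh.connect(host, username=username, password=password)
        with ssh.open_sftp() as sftp:
            with sftp.open(remote_path, "rb", bufsize=32768) as f:
                return read_first_rows(f, n_rows)
```

Because `read_first_rows` only needs an object with a `.read(size)` method, it can be tried locally against an `io.BytesIO` or a file opened in binary mode before pointing it at a real server.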