I'm stuck piping ftplib.FTP.retrlines to csv.reader
...
FTP.retrlines repeatedly calls a callback with a line in it, while csv.reader expects an iterator which returns a string each time its __next__() method is called.
How do I combine the two so that I can read and process the file as it arrives, without reading the whole file in advance and storing it in, e.g., an io.TextIOWrapper?
My problem is that FTP.retrlines won't return until it has consumed the whole file...
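To make the mismatch concrete, here is a minimal sketch. fake_retrlines is a hypothetical stand-in for FTP.retrlines (it only mimics the "call the callback once per line" behaviour, with made-up sample lines), so it runs without a server. Collecting the lines into a list works, but it is exactly the buffering I want to avoid:

```python
import csv


def fake_retrlines(callback):
    # Hypothetical stand-in for FTP.retrlines: pushes each line to the callback
    # and returns only after the last line has been delivered.
    for line in ["id,name", "1,alice", "2,bob"]:
        callback(line)


lines = []
fake_retrlines(lines.append)     # push side: the whole "file" is buffered here
rows = list(csv.reader(lines))   # pull side: csv.reader iterates over strings
print(rows)
```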
There may be a better solution, but you can glue FTP.retrlines and csv.reader together using an iterable, queue-like object. Because both functions are synchronous, you have to run them on different threads in parallel.
Something like this:
import csv
from ftplib import FTP
from queue import Queue
from threading import Thread

# host, username and password are placeholders for your own connection details
ftp = FTP(host)
ftp.login(username, password)

class LineQueue:
    _DONE = object()  # unique sentinel marking the end of the download

    def __init__(self):
        # Bounded queue: the download blocks once 10 lines are waiting
        self._queue = Queue(10)

    def add(self, s):
        print(f"Queueing line {s}")
        self._queue.put(s)
        print(f"Queued line {s}")

    def done(self):
        print("Signaling Done")
        self._queue.put(self._DONE)
        print("Signaled Done")

    def __iter__(self):
        print("Reading lines")
        while True:
            print("Reading line")
            s = self._queue.get()
            if s is self._DONE:
                print("Read all lines")
                break
            print(f"Read line {s}")
            yield s

q = LineQueue()

def download():
    try:
        ftp.retrlines("RETR /path/data.csv", q.add)
    finally:
        # Always signal the end, so the reader does not hang if the download fails
        q.done()

thread = Thread(target=download)
thread.start()

print("Reading CSV")
for entry in csv.reader(q):
    print(entry)
print("Read CSV")

thread.join()
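
The same idea can be wrapped in a reusable generator. The following is only a sketch under a few assumptions: iter_from_callback and fake_retrlines are hypothetical names introduced here, fake_retrlines stands in for FTP.retrlines so the example runs without a server, and the queue size of 10 is arbitrary. An exception raised on the download thread is passed through the queue and re-raised in the consumer.

```python
import csv
from queue import Queue
from threading import Thread

_DONE = object()  # unique end-of-stream marker


def iter_from_callback(producer, maxsize=10):
    """Yield every item that producer(callback) passes to its callback."""
    queue = Queue(maxsize)

    def run():
        try:
            producer(queue.put)      # blocks whenever the queue is full
        except Exception as exc:
            queue.put(exc)           # hand the error over to the consumer
        finally:
            queue.put(_DONE)

    # Daemon thread: it will not keep the process alive if the consumer
    # stops iterating early and the producer is left blocked on put().
    Thread(target=run, daemon=True).start()

    while True:
        item = queue.get()
        if item is _DONE:
            return
        if isinstance(item, Exception):
            raise item
        yield item


def fake_retrlines(callback):
    # Hypothetical stand-in for FTP.retrlines, with made-up sample lines
    for line in ["id,name", "1,alice", "2,bob"]:
        callback(line)


rows = list(csv.reader(iter_from_callback(fake_retrlines)))
print(rows)
```

With a real connection, the producer would presumably be something like lambda cb: ftp.retrlines("RETR /path/data.csv", cb), and you would loop over csv.reader(...) directly instead of building a list, so that rows are processed as they arrive.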