Extracting a list of XML files from a tar.gz file on an FTP server


I need to extract a list of xml files that are in a tar.gz file that I'm trying to read.

I tried this:

import os
from ftplib import FTP

def writeline(data):
    filedata.write(data)
    filedata.write(os.linesep)

ftp = FTP('ftp.my.domain.com')
ftp.login(user="username",passwd="password")
ftp.cwd('inner_folder')
filedata = open('mytargz.tar.gz', 'w')
ftp.retrlines('RETR %s' % ftp.nlst()[0], writeline)

I used ftp.nlst()[0] because there are several tar.gz files on my FTP server. The data I receive in my writeline callback looks like strange symbols, and then filedata.write(data) throws an error: {UnicodeEncodeError}'charmap' codec can't encode character '\x8b' in position 1: character maps to <undefined>. I could really use some help here.


Solution

  • I don't have an FTP server to try this with, but this should work:

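    from ftplib import FTP

    def writeline(data):
        # data is a chunk of raw bytes; write it unchanged
        filedata.write(data)

    ftp = FTP('ftp.my.domain.com')
    ftp.login(user="username", passwd="password")
    ftp.cwd('inner_folder')

    # 'wb' opens the local file in binary mode, so no text encoding is applied
    with open('mytargz.tar.gz', 'wb') as filedata:
        # retrbinary transfers the file as bytes instead of decoded text lines
        ftp.retrbinary('RETR %s' % ftp.nlst()[0], writeline)

    ftp.quit()
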
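    Note that we open the file in binary write mode ('wb'), ask the FTP server for binary data rather than text (retrbinary instead of retrlines), and have the callback write each chunk as-is, without adding line separators. The '\x8b' in the error message is the second byte of the gzip header (0x1f 0x8b), which shows that the download is binary data, not text.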
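
Once the archive has been downloaded as bytes, the XML files inside it can probably be listed with the standard tarfile module. The following is only a sketch and was not run against a real FTP server: the helper name list_xml_members is made up for illustration, and it assumes the archive is small enough to hold in memory and that the files of interest end in ".xml".

```python
import io
import tarfile


def list_xml_members(archive_bytes):
    """Return the names of regular .xml files inside a tar.gz given as bytes.

    Hypothetical helper for illustration; not part of ftplib or tarfile.
    """
    # Wrap the bytes in a file-like object so tarfile can read from memory.
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            # Skip directories and links; match the extension case-insensitively.
            if member.isfile() and member.name.lower().endswith(".xml")
        ]


# Possible usage with the file saved by the download code:
#
#     with open('mytargz.tar.gz', 'rb') as f:
#         print(list_xml_members(f.read()))
```

To skip the file on disk, the download could instead collect the chunks in an io.BytesIO buffer (passing buffer.write as the retrbinary callback) and then hand buffer.getvalue() to the helper.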