I am downloading tar files from an FTP server with Python. However, I am now getting the error "ReadError: unexpected end of data". I assume my file got corrupted. I can open the files when I fetch them outside Python with the 'wget' command in the terminal, but I would like to stick to Python only. This is my code:
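import os
import tarfile
from urllib.request import urlretrieve  # Python 3; on Python 2: from urllib import urlretrieve

# aod_ipng, url_ipng, ari and save_ipng are defined elsewhere.
os.chdir(aod_ipng)
# Download every archive into the current directory.
for x in ari:
    urlretrieve('%s%s' % (url_ipng, x), x)
for i in range(len(ari)):
    fileName = ari[i]
    ind = save_ipng[i].index('IVAOT')
    h5f = save_ipng[i][ind:]
    # 'r|' opens the archive as an uncompressed stream.
    tfile = tarfile.open(fileName, 'r|')
    for t in tfile:
        if t.name == h5f:
            tfile.extract(t)
    tfile.close()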
Reliable downloads of large files over bad connections are not easy. If the server supports HTTP range requests, you can resume a download after the connection breaks.
A good start is to use the requests library and read the remote file as a stream. However, you may still have to handle disconnects and resumes yourself.
See this question for how to use that API
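The resume idea can be sketched roughly as follows. This is only a minimal illustration, and it swaps in the standard library's urllib.request instead of requests so that it has no third-party dependency. It assumes an HTTP(S) URL and a server that answers a Range header with 206 Partial Content. The function name download_with_resume, the ".part" suffix, and the max_attempts and retry_delay parameters are made up for this example.

```python
import http.client
import os
import time
import urllib.error
import urllib.request


def download_with_resume(url, dest, max_attempts=5, chunk_size=64 * 1024,
                         retry_delay=1.0):
    """Download url to dest, resuming a partial file after a dropped connection."""
    part = dest + ".part"  # hypothetical name for the partial download
    for attempt in range(max_attempts):
        offset = os.path.getsize(part) if os.path.exists(part) else 0
        request = urllib.request.Request(url)
        if offset:
            # Ask the server for the bytes we do not have yet.
            request.add_header("Range", "bytes=%d-" % offset)
        try:
            with urllib.request.urlopen(request, timeout=30) as response:
                # 206 means the Range header was honoured; any other
                # success status means the body starts again at byte 0.
                resumed = offset > 0 and response.status == 206
                if not resumed:
                    offset = 0
                length = response.headers.get("Content-Length")
                expected = offset + int(length) if length is not None else None
                with open(part, "ab" if resumed else "wb") as out:
                    for chunk in iter(lambda: response.read(chunk_size), b""):
                        out.write(chunk)
        except urllib.error.HTTPError:
            raise  # e.g. 404: retrying will not help
        except (OSError, http.client.HTTPException):
            pass  # connection dropped: keep the partial file and retry
        else:
            # Only accept the file if its size matches what the server announced.
            if expected is None or os.path.getsize(part) == expected:
                os.replace(part, dest)
                return dest
        time.sleep(retry_delay)
    raise IOError("download of %s did not complete after %d attempts"
                  % (url, max_attempts))
```

With the names from the question, a call could look like download_with_resume(url_ipng + x, x) for each x in ari, provided the files are also reachable over HTTP. The Range header does not apply to a plain FTP server; there, ftplib's retrbinary takes a rest offset that serves the same purpose.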
But please make sure that the file is indeed a tar. You can use libmagic for file format detection.
That file extension suggests a gzip file, not a tar archive.
import gzip

# The with block closes the file even if reading fails.
with gzip.open('h5.gz', 'rb') as f:
    file_content = f.read()
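If installing libmagic is not an option, a rough format check can be done with the standard library alone. The sketch below is a stand-in for libmagic, not a replacement for it: it looks at the two-byte gzip signature and asks tarfile.is_tarfile whether the file parses as a tar archive. The helper name sniff_archive and its return labels are invented for this example.

```python
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def sniff_archive(path):
    """Return 'tar.gz', 'gzip', 'tar' or 'unknown' for the file at path."""
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == GZIP_MAGIC
    # is_tarfile also accepts compressed tar archives, so combine both checks.
    is_tar = tarfile.is_tarfile(path)
    if is_gzip and is_tar:
        return "tar.gz"
    if is_gzip:
        return "gzip"
    if is_tar:
        return "tar"
    return "unknown"
```

The result tells you how to open the file: mode 'r|' in tarfile.open expects an uncompressed tar stream, 'r:gz' or 'r|gz' handles a gzip-compressed tar, and a plain gzip file goes through gzip.open as shown above.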