python python-2.7 urllib carriage-return

Read file using urllib and write adding extra characters


I have a script that regularly reads a text file on a server and overwrites a local copy of that file with its contents. The problem is that the process adds extra carriage returns, plus an extra invisible character after the last character. How do I make an identical copy of the server file?

I use the following to read the file

for link in links:
    try:
        f = urllib.urlopen(link)
        myfile = f.read()
    except IOError:
        pass

and to write it to the local file

f = open("C:\\localfile.txt", "w")

try:
    f.write(myfile) 
except NameError:
    pass
finally:
    f.close()

This is how the file looks on the server: https://i.sstatic.net/inWfB.jpg

And this is how the file looks locally. In addition, there is an extra invisible character after the final 75: https://i.sstatic.net/WW7xA.jpg

I have seen quite a few similar questions, but I am not sure how to make urllib read the file in binary mode.

Any solution please?


Solution

  • If you want to copy a remote file denoted by a URL to a local file, I would use urllib.urlretrieve:

    import urllib
    urllib.urlretrieve("http://anysite.co/foo.gz", "foo.gz")
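
    The extra carriage returns in the question most likely come from opening the local file in text mode ("w") on Windows, where Python translates each "\n" to "\r\n" on write. If you prefer to keep the explicit read and write steps, a minimal sketch with that translation turned off could look like the following. The helper name copy_url_to_file is hypothetical, and the import fallback is only there so the same code runs on both Python 2.7 and Python 3.

```python
try:
    from urllib import urlopen           # Python 2.7, as in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 fallback (assumption)


def copy_url_to_file(url, local_path):
    """Copy the raw bytes behind `url` to `local_path` (hypothetical helper).

    Returns the number of bytes written.
    """
    response = urlopen(url)
    try:
        data = response.read()
    finally:
        response.close()

    # "wb" writes the bytes exactly as received. Plain "w" is text mode,
    # which on Windows turns every "\n" into "\r\n" while writing.
    with open(local_path, "wb") as out:
        out.write(data)
    return len(data)
```

    Calling copy_url_to_file(link, "C:\\localfile.txt") for each link would replace the two snippets in the question. To check the result, open both files with "rb" and compare the bytes. This sketch does not explain the extra invisible character at the end of the file.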