I just started using Python and I am trying to make a program that prints the lyrics of a song to the screen, read from a file on the internet ("www....../lyrics.txt"). My first code:
import urllib.request
lyrics=urllib.request.urlopen("http://hereIsMyUrl/lyrics.txt")
text=lyrics.read()
print(text)
When I ran this code, it didn't print the lyrics as they are written on the website. Instead, it printed the newline sequence '\r\n' everywhere a line break should have been, and gave me all the lyrics as one long, messy string. For example: Some lyrics here\r\nthis should already be the next line\r\nand so on.
I searched the internet for code to replace the '\r\n' sequences with new lines and tried the following:
import urllib.request
lyrics=urllib.request.urlopen("http://hereIsMyUrl/lyrics.txt")
text=lyrics.read()
text=text.replace("\r\n","\n")
print(text)
I hoped it would at least replace something, but instead it gave me a runtime error:
TypeError: expected bytes, bytearray or buffer compatible object
I searched the internet about that error, but I didn't find anything connected to opening files from the internet.
I have been stuck at this point for hours and have no idea how to continue. Please help! Thanks in advance!
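Your example is not working because the data returned by read() is a "bytes object", not a string. Calling replace on a bytes object with string arguments such as "\r\n" raises the TypeError you saw. You need to decode the data using an appropriate encoding first. See also the docs for request.urlopen, file.read and byte array operations.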
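To see the difference without touching the network, here is a minimal sketch that uses a hard-coded bytes literal as a stand-in for what lyrics.read() returns. The sample text is taken from the question; the variable names are only for illustration.

```python
# Stand-in for what lyrics.read() returns: a bytes object, not a str.
raw = b"Some lyrics here\r\nthis should already be the next line\r\nand so on"

# Mixing a bytes object with str arguments is what triggers the TypeError.
try:
    raw.replace("\r\n", "\n")
    error_raised = False
except TypeError:
    error_raised = True

# Option 1: decode first, then use ordinary string methods.
decoded = raw.decode("utf-8").replace("\r\n", "\n")

# Option 2: stay in bytes and pass bytes arguments instead.
replaced_bytes = raw.replace(b"\r\n", b"\n")

print(decoded)
```

Option 1 is usually what you want when the goal is to print text, since print shows a bytes object in its escaped form.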
A complete working example is given below:
#!/usr/bin/env python3
import urllib.request
# Example URL
url = "http://ntl.matrix.com.br/pfilho/oldies_list/top/lyrics/black_or_white.txt"
# Open URL: returns file-like object
lyrics = urllib.request.urlopen(url)
# Read raw data, this will return a "bytes object"
text = lyrics.read()
# Print raw data
print(text)
# Print decoded data:
print(text.decode('utf-8'))
# If you still need newline conversion, you could use the following
text = text.decode('utf-8')
text = text.replace('\r\n', '\n')
print(text)
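If you do this in more than one place, the steps above could be wrapped in small helpers. The following is only a sketch: the function names normalize_lyrics and fetch_lyrics are made up for illustration, and falling back to UTF-8 when the server does not declare a charset is an assumption that may not hold for every site.

```python
import urllib.request


def normalize_lyrics(raw, charset=None):
    """Decode raw bytes and convert Windows line endings to Unix newlines."""
    # Assumption: fall back to UTF-8 when no charset is known.
    text = raw.decode(charset or "utf-8", errors="replace")
    return text.replace("\r\n", "\n")


def fetch_lyrics(url):
    """Download a text file and return it as a str (hypothetical helper)."""
    # The with-statement closes the connection when the block ends.
    with urllib.request.urlopen(url) as response:
        # Charset from the Content-Type header, or None if it is not declared.
        charset = response.headers.get_content_charset()
        return normalize_lyrics(response.read(), charset)


# Offline demonstration of the decoding step with sample bytes.
sample = normalize_lyrics(b"line one\r\nline two\r\n")
print(sample)
```

With a real address you would call print(fetch_lyrics(url)). Passing errors="replace" keeps the program from crashing on bytes that are not valid in the chosen encoding, at the cost of showing replacement characters instead.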