I have a URL that returns just a list. For example, https://somepath.com/dev/doc/72 returns simply this (no HTML):
[
"A/RES/72/1",
"A/RES/72/2",
"A/RES/72/3",
"A/RES/72/4"
]
I want to take the entire contents (including the square brackets) and make this into a list. Doing it by hand, I can copy/paste as a list like this:
docs = [
"A/RES/72/1",
"A/RES/72/2",
"A/RES/72/3",
"A/RES/72/4"
]
print(docs)
['A/RES/72/1', 'A/RES/72/2', 'A/RES/72/3', 'A/RES/72/4']
I would like to read the content of the URL directly into a list.
I tried the following:
from urllib.request import urlopen
link = "https://somepath.com/dev/doc/72"
f = urlopen(link)
myfile = f.read()
print(myfile)
b'[\n "A/RES/72/1", \n "A/RES/72/2", \n "A/RES/72/3", \n "A/RES/72/4"\n]\n'
It's a bytes object full of newlines, not a list.
I'm guessing I would have to parse each line, removing the `\n` character, or something like `file.read().splitlines()`, but that seems overly complicated for such a simple input.
I've seen many solutions that parse .csv files, read inputs from each line, etc. But nothing to deal with a list that is already made and just needs to be called. Thanks for any help and pointers.
Edit: I tried this:
import urllib.request # the lib that handles the url stuff
link = "https://somepath.com/dev/doc/72"
a=[]
for line in urllib.request.urlopen(link):
    print(line.decode('utf-8'))
    a.append(line)
a
The `print` command gives me something close to what I want, but the `append` command gives me a mess again:
[b'[\n',
b' "A/RES/72/1", \n',
b' "A/RES/72/2", \n',
b' "A/RES/72/3", \n',
b' "A/RES/72/4"\n',
b']\n']
Edit: It turns out the URL is serving JSON. The solution by fuglede below (https://stackoverflow.com/a/60119016/10764078) works:
import requests
docs = requests.get('https://somepath.com/dev/doc/72').json()
I'm going to do some reading on JSON.
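For reference, here is a minimal sketch of what the JSON parsing step does to the raw bytes shown in my first attempt. It assumes the response body is UTF-8 encoded JSON; the variable name `raw` is just for illustration and stands in for the result of `f.read()`.

```python
import json

# Raw bytes as returned by f.read() in the first attempt above
raw = b'[\n "A/RES/72/1", \n "A/RES/72/2", \n "A/RES/72/3", \n "A/RES/72/4"\n]\n'

# json.loads accepts UTF-8 bytes on Python 3.6+ and ignores the
# whitespace and newlines between the array elements
docs = json.loads(raw)
print(docs)
# ['A/RES/72/1', 'A/RES/72/2', 'A/RES/72/3', 'A/RES/72/4']
```

So the newlines were never the problem: they are insignificant whitespace in JSON, and the parser discards them.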
Assuming that what the site is sending you is JSON, you can get the list with `requests`:
import requests
docs = requests.get('https://somepath.com/dev/doc/72').json()
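If installing `requests` is not an option, roughly the same thing can be done with the standard library alone. This is only a sketch, assuming the endpoint returns a UTF-8 encoded JSON array; the helper names `parse_doc_list` and `fetch_doc_list` and the `timeout` default are made up for illustration.

```python
import json
from urllib.request import urlopen


def parse_doc_list(raw: bytes) -> list:
    """Parse a raw JSON response body and check that it is a list."""
    docs = json.loads(raw)  # accepts UTF-8 bytes on Python 3.6+
    if not isinstance(docs, list):
        raise ValueError("expected a JSON array, got " + type(docs).__name__)
    return docs


def fetch_doc_list(url: str, timeout: float = 10.0) -> list:
    """Download the URL and return its JSON array as a Python list."""
    # The context manager closes the connection once the body is read
    with urlopen(url, timeout=timeout) as response:
        return parse_doc_list(response.read())
```

With the URL from the question, this would be called as `docs = fetch_doc_list("https://somepath.com/dev/doc/72")`. Keeping the parsing in its own function makes it easy to check against saved bytes without touching the network.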