I have multiple URLs that differ only in their query string parameters, mainly in the day values, for instance:
urls = [f'https://example.com?query=from-{x+1}d+TO+-{x}d%data' for x in range(10)]
I want to write the content of all these URLs to just one file. I tried with urllib.request:
import urllib.request
key = "some value"
requests = urllib.request.Request([url for url in urls], headers={"key":key})
<urllib.request.Request object at 0x7f48e8381490>
but the first pitfall is that a 'Request' object is not iterable:
responses = urllib.request.urlopen([request for request in requests])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Request' object is not iterable
Ideally, the results would go to a single file, something like:
data = open('file_name', 'a')
data.write([response.read() for response in responses])
I also tried with the requests library:
import requests
test = requests.Session()
r = test.get([url for url in urls], headers={"key":key})
but this fails with:
raise InvalidSchema("No connection adapters were found for '%s'" % url)
requests.exceptions.InvalidSchema: No connection adapters were found for <list of urls>
Is there a way to get the content of all these URLs, with headers, and write it to a single file?
I suppose you might want to do something like this:
import urllib.request

# open in binary append mode ("ab"), since response.read() returns bytes
with open("file_name", "ab") as data:
    for url in urls:
        # build one request per URL, passing the header from the question
        req = urllib.request.Request(url, headers={"key": key})
        with urllib.request.urlopen(req) as response:
            data.write(response.read())
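The file is opened in binary mode because response.read() returns bytes; if you want a text file instead, decode each response before writing.

If you would rather stay with the requests library you already tried, the same loop approach works there as well: send one request per URL instead of passing the whole list. A minimal sketch, assuming the same urls list and key variable from the question:

import requests

with requests.Session() as session, open("file_name", "ab") as data:
    for url in urls:
        # one GET per URL; the session reuses the underlying connection
        response = session.get(url, headers={"key": key})
        response.raise_for_status()    # fail early on HTTP errors
        data.write(response.content)   # response.content is bytes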