I'm new to Python 3, so please don't bash me :D
I'm using the following code to read a .txt file that contains different URLs so I can check their status codes. In my example I added five site URLs.
Here is import.txt, with one URL per line:
https://site1.site
https://site2.site
https://site3.site
https://site4.site
https://site5.site
and this is the Python script itself:
import requests

with open('import.txt', 'r') as f :
    for line in f :
        print(line)

#try :
r = requests.get(line)
print(r.status_code)
#except requests.ConnectionError :
#    print("failed to connect")
This is the response:
https://site1.site
https://site2.site
https://site3.site
https://site4.site
https://site5.site
400
Even though site3 and site4 return 301s and site5 fails to connect, I only receive a single 400 response for all of the submitted URLs.
If I use requests.head for each of those URLs with the following script, I receive the correct page status code ('Moved Permanently' for the example below). This is the single-request script:
import requests

try:
    r = requests.head("http://site3.net/")
    if r.status_code == 200:
        print('Success!')
    elif r.status_code == 301:
        print('Moved Permanently')
    elif r.status_code == 404:
        print('Not Found')
    # print(r.status_code)
except requests.ConnectionError:
    print("failed to connect")
Kudos to What’s the best way to get an HTTP response code from a URL?
Your call to requests.get() is outside the for loop, so it is only executed once, against whatever URL the variable line holds after the loop finishes (the last line of the file). Try indenting the relevant lines, like so:
import requests

with open('import.txt', 'r') as f :
    for line in f :
        print(line)
        #try :
        r = requests.get(line)
        print(r.status_code)
        #except requests.ConnectionError :
        #    print("failed to connect")
PS: I suggest you use 4-space indents. That way, errors like this are easier to spot.
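If it helps, here is a minimal sketch of the same loop with the commented-out error handling from your question re-enabled, and with the trailing newline stripped from each line read from the file; the file name import.txt comes from your question, everything else is just an illustrative assumption:

import requests

with open('import.txt', 'r') as f:
    for line in f:
        url = line.strip()  # lines read from a file keep their trailing newline
        if not url:
            continue  # skip blank lines
        try:
            r = requests.get(url)
            print(url, r.status_code)
        except requests.ConnectionError:
            print(url, "failed to connect")

One more note on the 301s you mention: requests.get() follows redirects by default, so a redirected URL may be reported with the final status code after the redirect, while requests.head() does not follow redirects by default, which is why your single-request script reports 'Moved Permanently'.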