python beautifulsoup http-status-code-403

Can't get the proper header working (403 Error)


I've been following a guide on YouTube, but I'm stuck on finding the right User-Agent to get past an HTTP 403 Forbidden error.

This is the code I'm trying:

from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq

user_agent = 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.7) Gecko/2009021910 Firefox/3.0.7'

my_url = "https://www.pedidosya.com.ar/restaurantes/buenos-aires/monserrat/empanadas-delivery"
headers = {'User-Agent': user_agent}

uReq(my_url)

Solution

  • Perhaps the issue is that you aren't keeping a session going with the server, so state such as cookies is lost between requests? I've run into a problem with redirects timing out, and the solution was to use requests.session(). My code ended up looking something like this:

    import bs4
    import requests

    # A session keeps cookies and default headers across requests.
    s = requests.session()
    s.headers['User-Agent'] = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36'
    res = s.get('https://www.pedidosya.com.ar/restaurantes/buenos-aires/monserrat/empanadas-delivery')
    # Parse the returned HTML with Python's built-in parser.
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    

    When I run this, I don't get an error.

    I just found the User-Agent in a post online. I have no idea how it really works, but it makes my scripts work, so I don't have to understand it XD
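
For what it's worth, the snippet in the question builds a headers dict but never hands it to urlopen, so the request goes out with urllib's default User-Agent (Python-urllib/x.y), and that is probably what gets refused. Below is a minimal sketch of a urllib-only route. It assumes the site only checks the User-Agent header; it may well look at cookies or other headers too, in which case the session approach is the safer bet. build_request and fetch_html are hypothetical helper names, not something from the original post.

```python
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Browser string copied from the answer above; any realistic one may do.
USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_2) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/34.0.1847.131 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that actually carries the User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch_html(url):
    """Return the page body as text, or None if the server answers with an HTTP error."""
    try:
        with urlopen(build_request(url), timeout=10) as response:
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except HTTPError as err:
        # A 403 here means the server still refused the request despite the header.
        print(f"HTTP {err.code} for {url}")
        return None
```

The string returned by fetch_html(my_url) can then be passed to BeautifulSoup in place of res.text. If it still comes back as None with a 403, the header alone was not enough for that site.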