python web-scraping beautifulsoup http-status-code-403

403 response from coches.net using requests


I'm very new to web scraping. I want to scrape the coches.net website for a fun data analysis exercise, but the following code always returns a 403 response.

import requests
from bs4 import BeautifulSoup
import time

headers={'User-Agent':'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}
base_url = 'https://www.coches.net/segunda-mano/?pg={}&st=1'


for counter in range(1,80):
    url = base_url.format(counter)
    # Get links
    response = requests.get(url)
    print (response.status_code)
    soup = BeautifulSoup(response.content, "html.parser")
    blocks = soup.select('.mt-Card-body')
    print (blocks)
    time.sleep(1)

I've been looking at some web pages (indeed, my code is strongly inspired by what I've found so far) and it seems like my code should be OK. Any help? How can I avoid the 403 response? Is it caused by my code, or does coches.net simply not allow Python scripts to access it?


Solution

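  • You created the headers dictionary but never pass it to requests.get, so the request is sent with the default python-requests User-Agent. Pass your headers along with the request and you should get a 200 status code: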

    response = requests.get(url, headers=headers)
    

    If this helped, please mark the answer as correct.
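
    As a minimal sketch of the same fix, the headers can be attached once to a requests.Session so that every page request carries them. The helper names build_session and fetch_pages are hypothetical (they are not part of requests or of the question), and the page range, delay, and timeout are illustrative assumptions. The site may also apply other bot checks, so a 200 is likely but not guaranteed.

```python
import time

import requests

# Same User-Agent string as in the question.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36"
    )
}
BASE_URL = "https://www.coches.net/segunda-mano/?pg={}&st=1"


def build_session(headers=HEADERS):
    """Hypothetical helper: return a Session that sends `headers` on every request."""
    session = requests.Session()
    session.headers.update(headers)
    return session


def fetch_pages(session, first=1, last=3, delay=1.0):
    """Hypothetical helper: yield (page_number, status_code) for each listing page."""
    for page in range(first, last + 1):
        # The session adds the User-Agent header automatically.
        response = session.get(BASE_URL.format(page), timeout=10)
        yield page, response.status_code
        time.sleep(delay)  # pause between requests, as in the question


# Without custom headers, requests identifies itself as "python-requests/<version>".
print(requests.utils.default_user_agent())
```

    To try it, loop over fetch_pages(build_session()) and print each (page, status) pair. Nothing is sent over the network until that loop runs.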