python web-scraping url user-agent http-status-code-403

Python web scraping: 403 error even with a User-Agent header


I'm new to Python. I'm using BeautifulSoup and Requests to scrape "https://batdongsan.com.vn/nha-dat-ban-tp-hcm" to collect housing price data for my hometown, but the request is blocked with a 403 error even though I set a User-Agent header. Here is my code:

    import requests

    url3 = "https://batdongsan.com.vn/nha-dat-ban-tp-hcm"

    # Browser-like User-Agent; the site still answers 403 with it
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36 Edg/103.0.1264.49"}

    page = requests.get(url3, headers=headers)

    print(page)

Result: <Response [403]>

Has anyone run into the same problem and managed to get past it? Any help is highly appreciated.

Many thanks


Solution

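  • import cloudscraper
    from bs4 import BeautifulSoup
    
    scraper = cloudscraper.create_scraper()
    
    # Name the parser explicitly to avoid bs4's "no parser specified" warning
    soup = BeautifulSoup(scraper.get("https://batdongsan.com.vn/nha-dat-ban-tp-hcm").text, "html.parser")
    
    print(soup.text)  # do what you want with the response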
    

    You can install cloudscraper with pip install cloudscraper
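
cloudscraper is not guaranteed to get past every anti-bot challenge, so it may help to check the status code before handing the body to BeautifulSoup. Below is a minimal sketch of that check. The helper name fetch_html, the session parameter, and the 30-second timeout are my own assumptions for illustration and are not part of cloudscraper.

```python
def fetch_html(url, session=None, timeout=30):
    """Return the page HTML, or raise if the server does not answer 200.

    `session` is any object with a requests-style .get() method
    (hypothetical parameter, added so the check can be tried offline).
    """
    if session is None:
        # Imported lazily so the helper also works with a stand-in session.
        import cloudscraper
        session = cloudscraper.create_scraper()

    response = session.get(url, timeout=timeout)

    # A 403 here means the request was still blocked despite cloudscraper.
    if response.status_code != 200:
        raise RuntimeError(f"GET {url} returned HTTP {response.status_code}")

    return response.text


# Example use (performs a real network request, so it is left commented out):
# html = fetch_html("https://batdongsan.com.vn/nha-dat-ban-tp-hcm")
# soup = BeautifulSoup(html, "html.parser")
```

Raising on anything other than 200 keeps a block page from being parsed as if it were the listing page.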