python web-scraping url user-agent http-status-code-403

Python web scraping: 403 error even with a User-Agent header

I'm a newbie learning Python. I'm using BeautifulSoup and Requests to scrape "https://batdongsan.com.vn/nha-dat-ban-tp-hcm" to collect data on housing prices in my hometown, but the request is blocked with a 403 error even though I set a User-Agent header. Here is my code:

import requests

url3 = "https://batdongsan.com.vn/nha-dat-ban-tp-hcm"

headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.114 Safari/537.36 Edg/103.0.1264.49"}

page = requests.get(url3, headers=headers)

print(page)

Result: <Response [403]>

Has anyone run into the same problem and managed to get past it? Any help is highly appreciated.

Many thanks

Solution

import cloudscraper
from bs4 import BeautifulSoup

scraper = cloudscraper.create_scraper()

soup = BeautifulSoup(scraper.get("https://batdongsan.com.vn/nha-dat-ban-tp-hcm").text, "html.parser")

print(soup.text)  # do what you want with the response

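You can install cloudscraper with pip install cloudscraper. It is built to get past Cloudflare's anti-bot page, which is presumably what returns the 403 here; a browser-like User-Agent alone does not satisfy that check.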
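If it helps, here is a minimal sketch of how the cloudscraper call could be wrapped so that a remaining 403 is reported explicitly instead of being parsed as if it were the real page. The helper names fetch_html and make_scraper and the 30-second timeout are my own assumptions, not part of cloudscraper; the only cloudscraper call used is create_scraper(), which returns a requests-style session.

```python
def fetch_html(url, session, timeout=30):
    """Fetch url with any requests-style session and return the HTML text.

    fetch_html is a hypothetical helper; the timeout default is an assumption.
    """
    response = session.get(url, timeout=timeout)
    if response.status_code == 403:
        # Still blocked: fail loudly instead of parsing the block page
        raise PermissionError(f"403 Forbidden for {url}: the site is still blocking this client")
    # Raise for any other 4xx/5xx status
    response.raise_for_status()
    return response.text


def make_scraper():
    """Create a cloudscraper session (hypothetical convenience wrapper)."""
    # Imported lazily so fetch_html can be used without cloudscraper installed
    import cloudscraper
    return cloudscraper.create_scraper()
```

Calling fetch_html("https://batdongsan.com.vn/nha-dat-ban-tp-hcm", make_scraper()) and passing the result to BeautifulSoup(html, "html.parser") should behave like the snippet above, except that a block shows up as a PermissionError. Because fetch_html accepts any session with a get method, the same function also works with a plain requests.Session() if you want to compare the two.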