python python-3.x web-scraping python-requests-html

requests_html returns blank

    Python 3.8.2 (default, Apr  8 2020, 14:31:25)
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from requests_html import HTMLSession
    >>> session = HTMLSession()
    >>> r = session.get('https://www.sahibinden.com/ilan/emlak-konut-gunluk-kiralik-holiday-business-suit-lux-otel-konforunda-suit-daireler-803346031/detay')
    >>> r.html.find("#classifiedId")
    []

I ran this code, but the result is an empty list. Calling r.html.render() did not change it, and an XPath query returned nothing either. How can I fix this?

Solution

The site requires a User-Agent header and a cookie named "s3IssGuY1". I don't know whether (or when) the cookie value expires, but if it does, you can copy a fresh value from the Firefox/Chrome developer tools:

import requests
from bs4 import BeautifulSoup

url = 'https://www.sahibinden.com/ilan/emlak-konut-gunluk-kiralik-holiday-business-suit-lux-otel-konforunda-suit-daireler-803346031/detay'
# The site expects a browser-like User-Agent and the "s3IssGuY1" cookie.
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
cookies = {'s3IssGuY1': 'A_Lne-ByAQAAWJ_crjFYgFyVj0loVQQA3jwlYwVTH-vnpfLSbIkEJkwRS9NDAVX4a-mcuNvjwH8AADQwAAAAAA=='}

response = requests.get(url, headers=headers, cookies=cookies)
soup = BeautifulSoup(response.content, 'html.parser')

# Pair each label (<strong>) with its value (<span>) by position.
labels = soup.select('.classifiedInfoList strong')
values = soup.select('.classifiedInfoList span')
for label, value in zip(labels, values):
    print('{:<30} {}'.format(label.get_text(strip=True), value.get_text(strip=True)))

Prints:

İlan No                        803346031
İlan Tarihi                    23 Haziran 2020
Emlak Tipi                     Günlük Kiralık Daire
m² (Brüt)                      35
m² (Net)                       30
Oda Sayısı                     Stüdyo (1+0)
Bulunduğu Kat                  5
Kat Sayısı                     5
Isıtma                         Merkezi
Banyo Sayısı                   1
Site İçerisinde                Hayır
Kimden                         Emlak Ofisinden
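
If the cookie value stops working, one way to refresh it is to copy the whole Cookie request header from the browser's developer tools and turn it into a dict. The following is only a minimal sketch, assuming the header is a plain `name=value; name2=value2` string; the helper name `parse_cookie_header` and the sample header text are hypothetical and not part of the answer above.

```python
def parse_cookie_header(header: str) -> dict:
    """Turn a raw 'name=value; name2=value2' Cookie header into a dict."""
    cookies = {}
    for part in header.split(";"):
        part = part.strip()
        if not part:
            continue  # skip empty fragments such as a trailing ';'
        # Split on the first '=' only, so values ending in '==' stay intact.
        name, sep, value = part.partition("=")
        if sep:
            cookies[name.strip()] = value.strip()
    return cookies


# Hypothetical header text; paste the real one from the browser instead.
raw = "s3IssGuY1=EXAMPLE_VALUE==; other=1"
cookies = parse_cookie_header(raw)
print(cookies)
```

The resulting dict can be passed as the `cookies=` argument of `requests.get`, in place of the hard-coded dict used above.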