
requests_html returns blank


    Python 3.8.2 (default, Apr  8 2020, 14:31:25) 
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from requests_html import HTMLSession
    >>> session = HTMLSession()
    >>> r = session.get('https://www.sahibinden.com/ilan/emlak-konut-gunluk-kiralik-holiday-business-suit-lux-otel-konforunda-suit-daireler-803346031/detay')
    >>> r.html.find("#classifiedId")
    []

I ran this code, but the result is an empty list. I tried calling r.html.render(), but the result didn't change. I also tried locating the element with XPath, still with no result. How can I fix this?


Solution

  • The site requires a User-Agent header and a cookie named "s3IssGuY1". I don't know whether (or when) this cookie value has to be refreshed, but if it stops working you can copy a current value from the Firefox/Chrome developer tools:

    import requests
    from bs4 import BeautifulSoup


    # The site returns no usable page without a browser-like User-Agent and this cookie.
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
    url = 'https://www.sahibinden.com/ilan/emlak-konut-gunluk-kiralik-holiday-business-suit-lux-otel-konforunda-suit-daireler-803346031/detay'
    cookies = {'s3IssGuY1': 'A_Lne-ByAQAAWJ_crjFYgFyVj0loVQQA3jwlYwVTH-vnpfLSbIkEJkwRS9NDAVX4a-mcuNvjwH8AADQwAAAAAA=='}
    response = requests.get(url, headers=headers, cookies=cookies)
    soup = BeautifulSoup(response.content, 'html.parser')

    # Pair each label (<strong>) with its value (<span>) in the info list.
    for st, sp in zip(soup.select('.classifiedInfoList strong'), soup.select('.classifiedInfoList span')):
        print('{:<30} {}'.format(st.get_text(strip=True), sp.get_text(strip=True)))

    Prints:

    İlan No                        803346031
    İlan Tarihi                    23 Haziran 2020
    Emlak Tipi                     Günlük Kiralık Daire
    m² (Brüt)                      35
    m² (Net)                       30
    Oda Sayısı                     Stüdyo (1+0)
    Bulunduğu Kat                  5
    Kat Sayısı                     5
    Isıtma                         Merkezi
    Banyo Sayısı                   1
    Site İçerisinde                Hayır
    Kimden                         Emlak Ofisinden
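
    If you would rather stay with requests_html, the same headers and cookie can probably be passed to session.get(), since HTMLSession is a subclass of requests.Session. Below is a minimal sketch, not a tested fix for this site. The helper names cookies_from_header and fetch_classified_id are made up for illustration, and it assumes you copy the whole Cookie request header value from the developer tools and that the page really contains an element with id "classifiedId", as in the question.

```python
def cookies_from_header(raw_header):
    """Turn a raw Cookie header value ('a=1; b=2') into a dict.

    Hypothetical helper: splits on ';' and on the first '=' only,
    so values that end in '==' are kept intact.
    """
    cookies = {}
    for part in raw_header.split(';'):
        name, sep, value = part.strip().partition('=')
        if sep and name:
            cookies[name] = value
    return cookies


def fetch_classified_id(url, user_agent, raw_cookie_header):
    """Repeat the question's lookup, sending a User-Agent and cookies.

    Hypothetical helper; returns the element text, or None if the
    '#classifiedId' element is not found in the response.
    """
    # Imported here so cookies_from_header() works without requests_html.
    from requests_html import HTMLSession

    session = HTMLSession()
    r = session.get(
        url,
        headers={'User-Agent': user_agent},
        cookies=cookies_from_header(raw_cookie_header),
    )
    element = r.html.find('#classifiedId', first=True)
    return element.text if element is not None else None
```

    Call fetch_classified_id() with the URL, the User-Agent string, and the cookie string copied from the browser. If it still returns None, the cookie value has most likely expired or the site expects additional cookies.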