Tags: python, web-scraping, urllib

How to download a page with lazy loading?


I need to download a full page and parse it, but the page creates some of its elements with JavaScript. When I fetch it with urllib, I receive an HTML page that does not contain the elements generated by JavaScript. How can I solve this problem?

import urllib.request
from bs4 import BeautifulSoup

page = urllib.request.urlopen('https://www.example.com')
soup = BeautifulSoup(page, 'html5lib')
...

I am trying this selector:

colordiv = soup.select("div.pswp__item:nth-child(1) > div:nth-child(1) > img:nth-child(1)")[0]

On this page:

https://www.electrictobacconist.com/smok-nord-p5831
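
One way to confirm that an element is created by JavaScript is to check whether its class appears in the raw HTML at all. The following is a minimal sketch using only the standard library; `ClassFinder`, `has_class`, and the sample HTML string are hypothetical names and data chosen for illustration, not taken from the page in question.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Records whether any tag in the parsed HTML carries a given CSS class."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted in classes:
            self.found = True


def has_class(html, wanted):
    finder = ClassFinder(wanted)
    finder.feed(html)
    return finder.found


# Hypothetical static HTML, as a server might send it before any script runs
static_html = '<html><body><div class="product"></div><script src="app.js"></script></body></html>'
print(has_class(static_html, "pswp__item"))  # False
```

If the check returns False for the HTML that urllib downloaded, the selector cannot match, because the element only exists after the browser runs the page's scripts.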

Solution

  • Even though the page is rendered with JavaScript, the data itself arrives through an AJAX response in the background. All you have to do is make that request yourself.

    import re
    import requests

    url = 'https://www.electrictobacconist.com/smok-nord-p5831'
    # The product id is the last run of digits in the URL (here: 5831)
    product_id = re.findall(r'\d+', url)[-1]
    r = requests.get('https://www.electrictobacconist.com/ajax/get_product_options/{}'.format(product_id))
    # Print the value of every option of the first attribute
    print([x['value'] for x in r.json()['attributes'][0]['values']])
    

    Output:

    ['Black/Blue', 'Black/White', 'Bottle Green', 'Full Black', 'Prism Gold', 'Prism Rainbow', 'Red', 'Resin Rainbow', 'Yellow/Purple', 'Blue/Brown', 'Red/Yellow', 'Red/Green', 'Black/White Resin']
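
The two parsing steps in the answer (taking the id from the URL and reading the option values from the decoded JSON) can be separated from the network call, which makes them easier to check. This is only a sketch: `product_id_from_url`, `option_values`, and the `sample` payload are hypothetical, and the payload is trimmed to the shape the code above relies on (`attributes` → `values` → `value`); the real response may contain other fields or change over time.

```python
import re


def product_id_from_url(url):
    """Return the last run of digits in the URL, e.g. '5831'."""
    matches = re.findall(r"\d+", url)
    if not matches:
        raise ValueError("no numeric id in URL: " + url)
    return matches[-1]


def option_values(payload, attribute_index=0):
    """Pull the 'value' field of each option from an already-decoded JSON payload."""
    attributes = payload.get("attributes") or []
    if attribute_index >= len(attributes):
        return []
    return [item["value"] for item in attributes[attribute_index].get("values", [])]


# Hypothetical payload, trimmed to the shape the answer's code relies on
sample = {"attributes": [{"values": [{"value": "Black/Blue"}, {"value": "Red"}]}]}
print(product_id_from_url("https://www.electrictobacconist.com/smok-nord-p5831"))  # 5831
print(option_values(sample))  # ['Black/Blue', 'Red']
```

With a live request, `option_values(r.json())` would take the place of the list comprehension, and an empty list would signal that the response no longer has the expected shape.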