I need to download a full page and parse it, but some of its elements are created by JavaScript. When I fetch the page with urllib, I receive the HTML without the elements that JavaScript generates. How can I solve this problem?
from urllib.request import urlopen
from bs4 import BeautifulSoup  # was missing from the original snippet
page = urlopen('https://www.example.com')
soup = BeautifulSoup(page, 'html5lib')
...
This is the selector I am trying:
colordiv = soup.select("div.pswp__item:nth-child(1) > div:nth-child(1) > img:nth-child(1)")[0]
on this page:
https://www.electrictobacconist.com/smok-nord-p5831
Although the page is rendered with JavaScript, the data itself arrives in the background as an AJAX response (requests like this show up in the Network tab of the browser's developer tools). All you have to do is make that request yourself.
import requests
import re
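url = 'https://www.electrictobacconist.com/smok-nord-p5831'
# The product ID is the last run of digits in the URL (5831 here)
product_id = re.findall(r'\d+', url)[-1]
# Request the product-options endpoint directly
r = requests.get("https://www.electrictobacconist.com/ajax/get_product_options/{}".format(product_id))
# Print the option names of the first attribute
print([x['value'] for x in r.json()['attributes'][0]['values']])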
Output:
['Black/Blue', 'Black/White', 'Bottle Green', 'Full Black', 'Prism Gold', 'Prism Rainbow', 'Red', 'Resin Rainbow', 'Yellow/Purple', 'Blue/Brown', 'Red/Yellow', 'Red/Green', 'Black/White Resin']
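If you want something a bit more defensive, the same idea can be split into small functions. This is only a sketch: the helper names (extract_product_id, option_values, fetch_options) are made up for illustration, the JSON shape is assumed from the snippet above (attributes -> values -> value), and the endpoint may change whenever the site is updated. It uses urllib and json from the standard library instead of requests, so it runs without extra packages.

```python
import json
import re
import urllib.request

# Endpoint pattern copied from the answer above; treat it as an assumption.
OPTIONS_URL = "https://www.electrictobacconist.com/ajax/get_product_options/{}"


def extract_product_id(url):
    """Return the trailing digits of a product URL, e.g. '5831' from '...-p5831'."""
    match = re.search(r"(\d+)/?$", url)
    if match is None:
        raise ValueError("no trailing product id in {!r}".format(url))
    return match.group(1)


def option_values(payload):
    """Collect the 'value' fields of the first attribute, or [] if there is none."""
    attributes = payload.get("attributes") or []
    if not attributes:
        return []
    return [v["value"] for v in attributes[0].get("values", []) if "value" in v]


def fetch_options(product_url, timeout=10):
    """Hypothetical helper: call the options endpoint and parse its JSON body."""
    request_url = OPTIONS_URL.format(extract_product_id(product_url))
    with urllib.request.urlopen(request_url, timeout=timeout) as response:
        return option_values(json.load(response))
```

The parsing step can be checked offline with a hand-written payload, e.g. option_values({"attributes": [{"values": [{"value": "Red"}]}]}) returns ['Red']. fetch_options('https://www.electrictobacconist.com/smok-nord-p5831') performs the real request and depends on the site still answering at that URL.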