I want to get a product URL from this website: https://stockx.com/search?s=555088-105
I tried this code:
link = soup.find("div", class_ = 'browse-grid loading undefined')
print(link)
but it just returns:
<div class="browse-grid loading undefined"><div class="back-to-top"><div class="back-to-top-container"><img alt="back to top" src="https://stockx-assets.imgix.net/svg/icons/back-to-top.svg?auto=compress,format"/><span>TOP</span></div></div><div class="browse-grid"><div class="no-results">NOTHING TO SEE HERE! PLEASE CHANGE YOUR FILTERS OR <a href="/product-suggestion">Suggest a Product</a></div></div></div>
I also tried this, but it prints every URL on the page except the one I want:
a_tags = soup.find_all('a')
for tag in a_tags:
    print(tag.get('href'))
How can I get the URL shown in my picture?
The URL you see on the page is loaded from an external source via JavaScript, so BeautifulSoup
doesn't see it in the HTML you downloaded. You can simulate the Ajax request with the requests
module:
import re
import json
import requests
url = "https://stockx.com/search?s=555088-105"
api_url = "https://stockx.com/api/browse"
id_ = re.search(r"s=([\d-]+)", url).group(1)
params = {
    "": "",
    "currency": "EUR",
    "_search": id_,
    "dataType": "product",
}
headers = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:88.0) Gecko/20100101 Firefox/88.0",
    "Referer": url,
}
data = requests.get(api_url, params=params, headers=headers).json()
# uncomment this to print all data:
# print(json.dumps(data, indent=4))
for product in data["Products"]:
    print("https://stockx.com/" + product["urlKey"])
Prints:
https://stockx.com/air-jordan-1-retro-high-dark-mocha
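If you want to reuse this, the two pure steps (pulling the search term out of the URL and turning the JSON into links) can be split into small functions that you can check without hitting the site. This is only a minimal sketch: the helper names `search_term` and `product_urls` are made up for illustration, and it assumes the response keeps the shape used above (a "Products" list whose items have a "urlKey"). The sample payload is hand-written, not real API output.

```python
from urllib.parse import parse_qs, urljoin, urlparse

BASE_URL = "https://stockx.com/"


def search_term(search_url):
    """Return the value of the `s` query parameter, or None if it is absent."""
    query = parse_qs(urlparse(search_url).query)
    values = query.get("s")
    return values[0] if values else None


def product_urls(payload):
    """Build product page URLs from an already-decoded JSON payload.

    Assumes the payload looks like {"Products": [{"urlKey": "..."}, ...]}.
    """
    urls = []
    for product in payload.get("Products", []):
        key = product.get("urlKey")
        if key:  # skip entries that have no usable key
            urls.append(urljoin(BASE_URL, key))
    return urls


# Hand-written sample shaped like the response above (an assumption, not real data).
sample = {
    "Products": [
        {"urlKey": "air-jordan-1-retro-high-dark-mocha"},
        {"title": "entry without a urlKey"},
    ]
}

print(search_term("https://stockx.com/search?s=555088-105"))
print(product_urls(sample))
```

Using `parse_qs` instead of a regex also handles search terms that contain letters or URL-encoded characters, and `.get()` avoids a `KeyError` if the site returns an empty or differently shaped response.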