I checked most of the posts but didn't find an answer to my small question.
<div class="input-box">
<select name="super_attribute[138]" id="attribute138" class="required-entry super-attribute select form-control" onchange="notifyMe(this.value, this.options[this.selectedIndex].innerHTML);">
<option value="">Choose an Option...</option>
<option value="17" price="0">M (in stock) </option>
<option value="18" price="0">L (out of stock) </option>
<option value="15" price="0">XL (in stock) </option>
<option value="52" price="0">XXL (in stock) </option>
</select>
</div>
My Python code is:
items = soup.select('option[value]')
values = [item.get('value') for item in items]
textvalues = [item.text for item in items]
print(textvalues)
And the output is: ['select', '(In-Stock)', '(Out-Stock)', '(In-Stock)', '(In-Stock)']
I also need the other values (SizeValue and SizeName): 17 & M / 18 & L / 15 & XL / 52 & XXL
If I remove the .text, I get this output:
<option value="">select</option>, <option value="200@#-(In-Stock)@#-https://store.alsabihmarine.com/index.php/diving-equipments/wetsuits/camouflage-hooded-suits-220.html@#-">(In-Stock)</option>, <option value="201@#-(Out-Stock)@#-https://store.alsabihmarine.com/index.php/diving-equipments/wetsuits/camouflage-hooded-suits-220.html@#-">(Out-Stock)</option>, <option value="202@#-(In-Stock)@#-https://store.alsabihmarine.com/index.php/diving-equipments/wetsuits/camouflage-hooded-suits-220.html@#-">(In-Stock)</option>, <option value="203@#-(In-Stock)@#-https://store.alsabihmarine.com/index.php/diving-equipments/wetsuits/camouflage-hooded-suits-220.html@#-">(In-Stock)</option>
Thanks for your help in advance.
It's quite simple: concatenate with + and also take the text of each item (item.text, or item.get_text(strip=True) to trim whitespace) in your list comprehension.
Instead of:
values = [item.get('value') for item in items]
use:
values = [item.get('value') + item.get_text(strip=True) for item in items[1:]]
print(values)
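If the value and the size name are needed as separate fields rather than one concatenated string, something along these lines may work. This is only a sketch against the static markup from the question: it assumes every label has the form "SIZE (stock status)", and the names html, pairs and size_name are illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question (static HTML, no JavaScript involved).
html = """
<select name="super_attribute[138]" id="attribute138">
<option value="">Choose an Option...</option>
<option value="17" price="0">M (in stock) </option>
<option value="18" price="0">L (out of stock) </option>
<option value="15" price="0">XL (in stock) </option>
<option value="52" price="0">XXL (in stock) </option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

pairs = []
for option in soup.select("select#attribute138 option"):
    value = option.get("value")
    if not value:  # skip the "Choose an Option..." placeholder
        continue
    # Assumption: the label looks like "M (in stock)"; keep the part before " (".
    size_name = option.get_text(strip=True).split(" (", 1)[0]
    pairs.append((value, size_name))

print(pairs)
```

Filtering on an empty value instead of slicing with items[1:] avoids relying on the placeholder always being the first option.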
EDIT: The options are filled in dynamically by JavaScript, and requests does not execute JavaScript, so the downloaded HTML does not contain them in rendered form. However, the data is embedded in the page source as JSON. You can extract it with a regular expression using the re module:
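import json
import re
import requests

url = "https://store.alsabihmarine.com/index.php/diving-equipments/wetsuits/camouflage-hooded-suits-220.html"
# Use .text (decoded str) rather than str(bytes), which would yield the
# escaped "b'...'" representation instead of the actual page source.
html = requests.get(url).text
# Capture the JSON object passed to Product.Config(...) in the page source.
regex_pattern = re.compile(r"Product\.Config\((\{.*?\})\);", re.DOTALL)
data = json.loads(regex_pattern.search(html).group(1))
print(
    [
        product["id"] + product["label"]
        for product in data["attributes"]["138"]["options"]
    ]
)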
Output:
['17M (in stock) ', '18L (out of stock) ', '15XL (in stock) ', '52XXL (in stock) ']
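To get the value, size name and stock status as separate fields instead of one string, the label can be split after the JSON is parsed. The sketch below does not fetch the page. It uses a hypothetical, trimmed stand-in for the page source (page_source) whose shape is inferred from the keys read above, and it assumes each label looks like "SIZE (stock status)".

```python
import json
import re

# Hypothetical, trimmed stand-in for the page source; only the keys used in
# the answer ("attributes" -> "138" -> "options" -> "id" / "label") are kept.
page_source = (
    'var spConfig = new Product.Config({"attributes": {"138": {"options": ['
    '{"id": "17", "label": "M (in stock) "}, '
    '{"id": "18", "label": "L (out of stock) "}, '
    '{"id": "15", "label": "XL (in stock) "}, '
    '{"id": "52", "label": "XXL (in stock) "}]}}});'
)

# Same idea as above: capture the JSON object passed to Product.Config(...).
match = re.search(r"Product\.Config\((\{.*?\})\);", page_source, re.DOTALL)
data = json.loads(match.group(1))

# Assumption: a label is "<size> (<stock status>)", possibly with stray spaces.
label_pattern = re.compile(r"^(?P<size>.+?)\s*\((?P<stock>[^)]*)\)\s*$")

sizes = []
for option in data["attributes"]["138"]["options"]:
    parsed = label_pattern.match(option["label"])
    if parsed:
        sizes.append((option["id"], parsed.group("size"), parsed.group("stock")))
    else:
        # Fall back to the raw label if it does not follow the assumed form.
        sizes.append((option["id"], option["label"].strip(), ""))

print(sizes)
```

On the real page the JSON contains more keys than shown here, so the stand-in string should be replaced with the text returned by requests.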