I am trying to extract some data from a website, but the page source does not have a class for each item. I need the price, quantity, and size of each product. Can you please guide me toward a solution?
I thought I could use the scroll menu to extract the data for each product, because that is the only class I saw in the page source. To sum up, I need the values of the data-comprice, data-quantity, and data-size attributes, but I have not found a solution yet. I am sharing my basic code and part of the page source. Thanks in advance!
Source:
<div class="scrollmenu">
<div data-value="2' x 3'" class="swatch-element 2-x-3 soldout ">
<input data-comprice="75.01" data-curprice="30.00" data-size="2' x 3'" data-quantity="0" data-sku="AAAA0536-EPERNAY-23" data-price="30.00" data-title="2' x 3'" type="radio" name="id" value="31781284839506" id="radio_31781284839506"/>
<label style="height:75px!important; min-width:135px!important; padding: 0 0px!important;" for="radio_31781284839506">
<p style="color: black; margin-bottom:0; font-size:15px; font-weight: bold;"> 2' x 3'</p> <br> <p style="color: #535258; margin-bottom:0; margin-top:-45px; text-decoration:line-through;"> $75.01 </p> <br> <p style="margin-top:-48px; margin-bottom:2px; color:#584c98; font-weight:bold; font-size: 20px;"> $30.00 </p>
</label>
</div>
<div data-value="2'7" x 7'3"" class="swatch-element 27-x-73 soldout ">
<input data-comprice="134.81" data-curprice="53.92" data-size="2'7" x 7'3"" data-quantity="0" data-sku="AAAA0536-EPERNAY-2773" data-price="53.92" data-title="2'7" x 7'3"" type="radio" name="id" value="31781284872274" id="radio_31781284872274"/>
<label style="height:75px!important; min-width:135px!important; padding: 0 0px!important;" for="radio_31781284872274">
<p style="color: black; margin-bottom:0; font-size:15px; font-weight: bold;"> 2'7" x 7'3"</p> <br> <p style="color: #535258; margin-bottom:0; margin-top:-45px; text-decoration:line-through;"> $134.81 </p> <br> <p style="margin-top:-48px; margin-bottom:2px; color:#584c98; font-weight:bold; font-size: 20px;"> $53.92 </p>
</label>
</div>
My initial code block:
from bs4 import BeautifulSoup
import requests
import pandas as pd
webpage = requests.get('https://markandday.com/products/epernay-cottage-denim-rug')
sp = BeautifulSoup(webpage.content, 'html.parser')
for datapage in sp.find('div',attrs={'class':'scrollmenu'}):
    Result=print (datapage)
type(Result)
You can call the find_all method on the scrollmenu div to collect its input tags, and then use the .get() method on each tag to read the attributes you need.
from bs4 import BeautifulSoup
html=""" <div class="scrollmenu">
<div data-value="2' x 3'" class="swatch-element 2-x-3 soldout ">
<input data-comprice="75.01" data-curprice="30.00" data-size="2' x 3'" data-quantity="0" data-sku="AAAA0536-EPERNAY-23" data-price="30.00" data-title="2' x 3'" type="radio" name="id" value="31781284839506" id="radio_31781284839506"/>
<label style="height:75px!important; min-width:135px!important; padding: 0 0px!important;" for="radio_31781284839506">
<p style="color: black; margin-bottom:0; font-size:15px; font-weight: bold;"> 2' x 3'</p> <br> <p style="color: #535258; margin-bottom:0; margin-top:-45px; text-decoration:line-through;"> $75.01 </p> <br> <p style="margin-top:-48px; margin-bottom:2px; color:#584c98; font-weight:bold; font-size: 20px;"> $30.00 </p>
</label>
</div>
<div data-value="2'7" x 7'3"" class="swatch-element 27-x-73 soldout ">
<input data-comprice="134.81" data-curprice="53.92" data-size="2'7" x 7'3"" data-quantity="0" data-sku="AAAA0536-EPERNAY-2773" data-price="53.92" data-title="2'7" x 7'3"" type="radio" name="id" value="31781284872274" id="radio_31781284872274"/>
<label style="height:75px!important; min-width:135px!important; padding: 0 0px!important;" for="radio_31781284872274">
<p style="color: black; margin-bottom:0; font-size:15px; font-weight: bold;"> 2'7" x 7'3"</p> <br> <p style="color: #535258; margin-bottom:0; margin-top:-45px; text-decoration:line-through;"> $134.81 </p> <br> <p style="margin-top:-48px; margin-bottom:2px; color:#584c98; font-weight:bold; font-size: 20px;"> $53.92 </p>
</label>
</div>
"""
soup=BeautifulSoup(html,"html.parser")
inps=soup.find("div",class_="scrollmenu").find_all("input")
for inp in inps:
    print(inp)
    # inp['data-comprice'] also works, but raises KeyError if the attribute is missing
    print(inp.get("data-comprice"))
    print(inp.get("data-curprice"))
    print(inp.get("data-quantity"))
    print(inp.get("data-size"))
Output:
<input data-comprice="75.01" data-curprice="30.00" data-price="30.00" data-quantity="0" data-size="2' x 3'" data-sku="AAAA0536-EPERNAY-23" data-title="2' x 3'" id="radio_31781284839506" name="id" type="radio" value="31781284839506"/>
75.01
30.00
0
2' x 3'
<input 7'3""="" data-comprice="134.81" data-curprice="53.92" data-price="53.92" data-quantity="0" data-size="2'7" data-sku="AAAA0536-EPERNAY-2773" data-title="2'7" x 7'3"" id="radio_31781284872274" name="id" type="radio" value="31781284872274" x=""/>
134.81
53.92
0
2'7
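Note that the last value printed above is 2'7 rather than 2'7" x 7'3". The pasted snippet has unescaped double quotes inside the attribute value, so html.parser ends the attribute at the first inner quote. If the raw HTML served by the site has the same problem (an assumption; the server may well escape them as &quot;, in which case .get("data-size") already returns the full string), one possible workaround is to read the size from the first <p> of the matching label instead. A minimal sketch, where the trimmed-down sample markup and the helper name size_from_label are made up for illustration:

```python
from bs4 import BeautifulSoup

# Trimmed-down version of the snippet from the question; the second input
# keeps the unescaped double quotes that truncate data-size.
html = """
<div class="scrollmenu">
  <div class="swatch-element">
    <input data-size="2' x 3'" data-quantity="0" id="radio_1" type="radio"/>
    <label for="radio_1"><p> 2' x 3'</p> <p> $75.01 </p></label>
  </div>
  <div class="swatch-element">
    <input data-size="2'7" x 7'3"" data-quantity="0" id="radio_2" type="radio"/>
    <label for="radio_2"><p> 2'7" x 7'3"</p> <p> $134.81 </p></label>
  </div>
</div>
"""


def size_from_label(soup, inp):
    """Hypothetical helper: prefer the label text, fall back to data-size."""
    input_id = inp.get("id")
    if input_id is None:
        return inp.get("data-size")
    # The label is tied to the input through its "for" attribute.
    label = soup.find("label", attrs={"for": input_id})
    first_p = label.find("p") if label is not None else None
    if first_p is None:
        return inp.get("data-size")
    return first_p.get_text(strip=True)


soup = BeautifulSoup(html, "html.parser")
inputs = soup.find("div", class_="scrollmenu").find_all("input")
sizes = [size_from_label(soup, inp) for inp in inputs]
print(sizes)
```

The fallback to data-size keeps the helper usable on pages where the label or its first <p> is missing.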
For the live website:
from bs4 import BeautifulSoup
import requests
html = requests.get('https://markandday.com/products/epernay-cottage-denim-rug')
soup=BeautifulSoup(html.text,"html.parser")
inps=soup.find("div",class_="scrollmenu").find_all("input")
for inp in inps:
    print(inp.get("data-comprice"))
    print(inp.get("data-curprice"))
    print(inp.get("data-quantity"))
    print(inp.get("data-size"))
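If you want the values as rows rather than printed lines, a sketch like the following may help. It wraps the same find / find_all / .get() calls in a function that takes the HTML text and returns one dict per input. The function name parse_variants, the dict keys, and the to_number helper are hypothetical names chosen here, and the sample reuses the two items from the question with the inner double quotes written as &quot;.

```python
from bs4 import BeautifulSoup


def to_number(value, cast):
    """Convert an attribute string with cast; return None if missing or invalid."""
    try:
        return cast(value)
    except (TypeError, ValueError):
        return None


def parse_variants(html_text):
    """Return one dict per <input> inside div.scrollmenu (empty list if absent)."""
    soup = BeautifulSoup(html_text, "html.parser")
    menu = soup.find("div", class_="scrollmenu")
    if menu is None:
        return []
    rows = []
    for inp in menu.find_all("input"):
        rows.append({
            "size": inp.get("data-size"),
            "compare_price": to_number(inp.get("data-comprice"), float),
            "current_price": to_number(inp.get("data-curprice"), float),
            "quantity": to_number(inp.get("data-quantity"), int),
        })
    return rows


# Sample based on the question's snippet, with inner quotes escaped as &quot;.
sample = """
<div class="scrollmenu">
  <div class="swatch-element">
    <input data-comprice="75.01" data-curprice="30.00" data-size="2' x 3'"
           data-quantity="0" type="radio"/>
  </div>
  <div class="swatch-element">
    <input data-comprice="134.81" data-curprice="53.92"
           data-size="2'7&quot; x 7'3&quot;" data-quantity="0" type="radio"/>
  </div>
</div>
"""

rows = parse_variants(sample)
for row in rows:
    print(row)
```

For the live page you could pass requests.get(url).text to parse_variants, and since the question already imports pandas, pd.DataFrame(rows) should turn the list into a table. The function returns an empty list when no scrollmenu div is found, which would happen if the page builds that menu with JavaScript (not verified here).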