import requests
from bs4 import BeautifulSoup as bs
my_url = 'https://www.olx.com.pk/item/oppo-f17-pro8128-iid-1034320813'
with requests.session() as s:
    r = s.get(my_url)
page_html = bs(r.content, 'html.parser')
# Collect every <script> tag on the page
safe = page_html.findAll('script')
print("The number of script tags is {0}".format(len(safe)))
# Print only the script tags that contain a Pakistani phone prefix
for i in safe:
    if "+92" in str(i):
        print(i)
I want to get the phone number that is present in window.state using a Python script, but I don't know how to parse window.state. I would be very thankful if you could help me with this problem. Thanks in advance!
As I mentioned in the comments, window.state is present inside the 7th <script> tag.
I extracted the contents of that script tag, did a string search for phoneNumber, found its index, and was able to get the data that you need.
Extracting the data from JSON would be easier, but the data isn't in JSON format.
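The string-search idea can also be written as a regular expression, which avoids depending on a fixed slice width if the number's length ever differs. The following is only a sketch under assumptions, not the code used in this answer: the function name extract_phone_number and the sample string are hypothetical, and it assumes the value always appears as "phoneNumber":"..." in the script text.

```python
import re

# Matches "phoneNumber":"<value>" and captures <value>.
# Assumption: the key and value are both double-quoted, as in the page's script text.
PHONE_RE = re.compile(r'"phoneNumber"\s*:\s*"([^"]*)"')


def extract_phone_number(text):
    """Return the first phoneNumber value found in text, or None if absent."""
    match = PHONE_RE.search(text)
    return match.group(1) if match else None


# Hypothetical, shortened stand-in for the contents of the script tag
sample = 'window.state = {"ad":{"phoneNumber":"+923077250739","title":"example"}};'
print(extract_phone_number(sample))
```

The same function could be given the script contents or the whole response text instead of the sample string. The index-based version follows.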
import bs4 as bs
import requests
url = 'https://www.olx.com.pk/item/oppo-f17-pro8128-iid-1034320813'
resp = requests.get(url)
# Convert the response text to HTML soup object
soup = bs.BeautifulSoup(resp.text, 'html.parser')
# Select the 7th script tag (that is where the data you need is present)
s = soup.find_all('script')[6]
# Extract the contents of script. This will be a string type.
f = s.contents[0]
# Find the index of substring "phoneNumber" - the data that you need.
idx = f.index('phoneNumber')
# Since you need the phone number, use string slicing and extract the data.
print(f[idx-1: idx + 28])
# Output:
# "phoneNumber":"+923077250739"