python web-scraping python-requests html-parsing hidden

How to get data-* attributes when web scraping with Python requests (Python requests creating some issues)

How can I get the value of data-d1-value when I am using Python's requests library?

The response from requests.get(url) does not contain the data-* attributes of the div, although they are present in the original web page.

The web page is as follows:

<div id="test1" class="class1" data-d1-value="150">
180
</div>

The code I am using is:

import requests
from bs4 import BeautifulSoup

req = requests.get(url)  # url holds the page address
soup = BeautifulSoup(req.text, 'lxml')
d1_value = soup.find('div', {'class': "class1"})
print(d1_value)

The result I get is:

<div id="test1" class="class1">
180
</div>

When I debugged this, I found that requests.get(url) does not return the full div: the id and class are there, but the data-* attributes are missing.

How should I modify my code to get the full div, including the data-* value?

As a concrete example, the URL in my case is: https://www.moneycontrol.com/india/stockpricequote/oil-drillingexploration/oilnaturalgascorporation/ONG

The div I am interested in has class="inprice1 nsecp", and the value of its data-numberanimate-value attribute is what I am trying to fetch.

Thanks in advance :)

Solution

EDIT

The website's response differs depending on how the page is requested. When it is fetched with requests, the value you are looking for is served like this:

<div class="inprice1 nsecp" id="nsecp" rel="92.75">92.75</div>

So you can read it either from the rel attribute or from the text:

soup.find('div', {'class':"inprice1"})['rel']
soup.find('div', {'class':"inprice1"}).get_text()

Example

import requests
from bs4 import BeautifulSoup

req = requests.get('https://www.moneycontrol.com/india/stockpricequote/oil-drillingexploration/oilnaturalgascorporation/ONG')

soup = BeautifulSoup(req.text, 'lxml')

print('rel: ' + soup.find('div', {'class': "inprice1"})['rel'])
print('text: ' + soup.find('div', {'class': "inprice1"}).get_text())

Output

rel: 92.75
text: 92.75
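Since the markup differs between the plain response and the rendered page, a small fallback helper can cope with both. The following is only a sketch: the function name extract_price and the rendered_html sample are hypothetical, and html.parser is used here instead of lxml so that the snippet runs without extra dependencies.

```python
from bs4 import BeautifulSoup


def extract_price(html):
    """Return the price from the first div with class inprice1, or None."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="inprice1")
    if div is None:
        return None
    # Try the attribute seen in the rendered page first, then rel.
    for attr in ("data-numberanimate-value", "rel"):
        value = div.get(attr)
        if value:
            return value
    # Fall back to the visible text of the div.
    return div.get_text(strip=True) or None


# Markup as served to requests (taken from the answer above).
static_html = '<div class="inprice1 nsecp" id="nsecp" rel="92.75">92.75</div>'
# Hypothetical markup resembling the rendered page.
rendered_html = '<div class="inprice1 nsecp" data-numberanimate-value="92.75">92.75</div>'

print(extract_price(static_html))
print(extract_price(rendered_html))
```

With a helper like this, the same call works whether the HTML came from req.text or from a rendered page source.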

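The data-* attribute you see when inspecting the page is most likely added by JavaScript after the page loads, so requests never receives it. To get the source as it appears in the browser's inspector, render the page with Selenium: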

Example

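from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from bs4 import BeautifulSoup
from time import sleep

# Selenium 4 takes the driver path via Service; the raw string keeps the backslashes literal
driver = webdriver.Chrome(service=Service(r'C:\Program Files\ChromeDriver\chromedriver.exe'))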
url = "https://www.moneycontrol.com/india/stockpricequote/oil-drillingexploration/oilnaturalgascorporation/ONG"

driver.get(url)
sleep(2)

soup = BeautifulSoup(driver.page_source, "lxml")
print(soup.find('div', class_='inprice1 nsecp')['data-numberanimate-value'])
driver.quit()  # closes the browser and ends the driver session

When the attribute is present in the HTML, as in your simplified example, you can read its value by indexing the result of find() with ['data-d1-value']:

Example

from bs4 import BeautifulSoup

html='''
<div id="test1" class="class1" data-d1-value="150">
180
</div>
'''

soup = BeautifulSoup(html, 'lxml')
d1_value = soup.find('div', {'class':"class1"})['data-d1-value']
print(d1_value)
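Indexing with ['data-d1-value'] raises a KeyError when the attribute is missing, which is exactly what happens with the response from requests. As a minimal sketch (using html.parser instead of lxml to avoid an extra dependency, and a made-up data-missing attribute name), .get() and .attrs let you inspect the tag without an exception:

```python
from bs4 import BeautifulSoup

html = '''
<div id="test1" class="class1" data-d1-value="150">
180
</div>
'''

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", class_="class1")

# .get() returns None instead of raising KeyError for a missing attribute.
print(div.get("data-d1-value"))
print(div.get("data-missing"))

# Collect every data-* attribute on the tag into a plain dict.
data_attrs = {k: v for k, v in div.attrs.items() if k.startswith("data-")}
print(data_attrs)
```

Printing div.attrs on the tag parsed from req.text is a quick way to see which attributes the server actually sent.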