The problem I am having is that the data I need does not show up when I run my Python code. It is visible when I "Inspect Element" in Chrome, but not in "View Source".
My code:
import bs4 as bs
import urllib
import urllib.request

url = 'https://ethplorer.io/address/0x8b353021189375591723e7384262f45709a3c3dc'
page = urllib.request.urlopen(url)
soup = bs.BeautifulSoup(page, 'html.parser')
cat = 0
for category in soup.findAll('td', {'class': 'list-field'}):
    print(category)
    cat = cat + 1
It pulls out the needed line
<td class="list-field" id="address-token-holdersCount"></td>
However, the cell comes back empty, even though the page shows a value for it (2345), as shown below.
When I check the page using "Inspect Element", the needed part looks like this:
<table class="table">
  <tbody>
    <tr class="even last">
      <td>Holders</td>
      <td id="address-token-holdersCount" class="list-field">"2345"</td>
    </tr>
  </tbody>
</table>
What do you recommend to fix this issue?
As you found out yourself, the value is not present in the page source; it is loaded dynamically through an AJAX request after the page loads. The urllib module (or requests) only returns the page source and does not run JavaScript, which is why you won't be able to get that value directly.
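To see the difference offline, here is a minimal sketch that uses only the standard library's html.parser. The CellText class and cell_text helper are hypothetical names made up for illustration, and the two HTML strings are the ones from your question (the static source versus what the browser shows after JavaScript has run, without the quotes that Inspect Element draws around text).

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the text inside the tag that carries a given id (hypothetical helper)."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.inside = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if dict(attrs).get("id") == self.target_id:
            self.inside = True

    def handle_endtag(self, tag):
        # Assumes the target cell contains no nested tags
        self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text += data


def cell_text(html, target_id):
    parser = CellText(target_id)
    parser.feed(html)
    return parser.text.strip()


# What urllib/requests receive: the cell exists but is empty
static_html = '<td class="list-field" id="address-token-holdersCount"></td>'
# What the browser's DOM holds after the AJAX call has filled it in
rendered_html = '<td id="address-token-holdersCount" class="list-field">2345</td>'

print(repr(cell_text(static_html, "address-token-holdersCount")))
print(repr(cell_text(rendered_html, "address-token-holdersCount")))
```

The first call returns an empty string, which matches what your BeautifulSoup loop printed.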
Go to Developer Tools > Network > XHR and refresh the page. You'll see an AJAX request made to this URL:
https://ethplorer.io/service/service.php?data=0x8b353021189375591723e7384262f45709a3c3dc
This URL returns the data as JSON. If you have a look at it, you can get the Holders number from it using the requests module and its built-in .json() method.
import requests
r = requests.get('https://ethplorer.io/service/service.php?data=0x8b353021189375591723e7384262f45709a3c3dc')
data = r.json()
holders = data['pager']['holders']['total']
print(holders)
# 2346
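If you want something a little more defensive, the following is a rough sketch rather than a definitive version. It assumes the JSON keeps the pager > holders > total layout used above, which is an undocumented endpoint and may change. The function names extract_holders and fetch_holders, and the 10-second timeout, are my own choices for illustration.

```python
def extract_holders(payload):
    """Return payload['pager']['holders']['total'], or None if any key is missing."""
    node = payload
    for key in ("pager", "holders", "total"):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def fetch_holders(address, timeout=10):
    """Fetch the holders count for an address (assumes the endpoint shown above)."""
    import requests  # imported here so extract_holders can be used without requests

    url = "https://ethplorer.io/service/service.php"
    r = requests.get(url, params={"data": address}, timeout=timeout)
    r.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return extract_holders(r.json())
```

You would call it as fetch_holders('0x8b353021189375591723e7384262f45709a3c3dc'). A return value of None means the response no longer has the expected layout, so you can handle that case instead of getting a KeyError.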