python html beautifulsoup google-colaboratory real-time-data

Why doesn't BeautifulSoup work well with the Coinmarketcap website?


I have a problem with the following code. When I run it in Google Colaboratory, I expect the result (a numerical value from the Coinmarketcap website) to change over time, because it changes continuously on the website, but I always get the same result. How can I fix this?

Your help would be highly appreciated:)

from bs4 import BeautifulSoup
import requests
while True:
     url="https://coinmarketcap.com/currencies/bitcoin/"
     html_content = requests.get(url).text

     soup = BeautifulSoup(html_content, "lxml")
     h = soup.find(class_='statsValue___2iaoZ').text.replace('$', '').replace('%','')
     print(f'\r{h}', end=" ")

880,648,583,648 (not changing):(


Solution

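  • The values are updated dynamically by JavaScript running in the browser. requests only downloads the HTML the server sends, and BeautifulSoup parses that HTML without executing any scripts, so the loop keeps seeing the value embedded in the served page. You can use Selenium instead, which drives a real browser: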

    !apt update
    !apt install chromium-chromedriver
    !pip install selenium
    

    Then:

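    import time

    from selenium import webdriver
    from bs4 import BeautifulSoup
    
    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    
    url = "https://coinmarketcap.com/currencies/bitcoin/"
    # Recent Selenium 4 releases no longer accept the driver path as a
    # positional argument; chromedriver is expected to be found on PATH.
    wd = webdriver.Chrome(options=options)
    wd.get(url)
    
    while True:
        # Re-read the live DOM, which the page's JavaScript keeps updating
        soup = BeautifulSoup(wd.page_source, "lxml")
        # Class name copied from the question; it may change when the site is redeployed
        h = soup.find(class_='statsValue___2iaoZ').text.replace('$', '').replace('%', '')
        print(f'\r{h}', end=" ")
        time.sleep(1)  # pause between reads instead of busy-looping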
    

    The values will now update as desired.
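
If you want to work with the scraped string as a number and only react when it actually changes, a small helper along these lines might be useful. This is only a sketch using the standard library: `parse_stat` and `poll_changes` are hypothetical names introduced here for illustration, not part of Selenium or BeautifulSoup, and the interval and read limit are arbitrary assumptions.

```python
import time
from typing import Callable, Iterator, Optional


def parse_stat(text: str) -> float:
    """Convert a scraped string such as '$880,648,583,648' or '1.25%' to a float."""
    cleaned = text.strip().replace("$", "").replace("%", "").replace(",", "")
    return float(cleaned)


def poll_changes(
    read_value: Callable[[], str],
    interval: float = 2.0,
    max_reads: Optional[int] = None,
) -> Iterator[float]:
    """Call read_value repeatedly and yield the parsed value only when it changes.

    read_value is any zero-argument function returning the raw text
    (hypothetical hook; in the Selenium case it would re-parse wd.page_source).
    """
    last = None
    reads = 0
    while max_reads is None or reads < max_reads:
        value = parse_stat(read_value())
        reads += 1
        if value != last:
            last = value
            yield value
        # Sleep between reads, but not after the final one
        if max_reads is None or reads < max_reads:
            time.sleep(interval)
```

With the Selenium session above, `read_value` could be a function that builds a fresh BeautifulSoup object from `wd.page_source` and returns the `.text` of the matching element; looping over `poll_changes(read_value)` then prints or stores a value only when the page shows a new one.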