Tags: python, html, urllib

Trouble with web scraping using string methods in Python


I'm extremely new to web scraping and I'm making a simple program with Python which uses string methods such as str.find().

Currently, I extract the HTML code of a webpage as a string via

from urllib.request import urlopen

html_str = urlopen(url).read().decode('utf-8')

However, I am confused as to why not all of the HTML is returned. For example, a YouTube channel page displays the subscriber count with

<yt-formatted-string id="subscriber-count" class="style-scope ytd-c4-tabbed-header-renderer">106M subscribers</yt-formatted-string>

But this string does not appear in html_str.

So, what's going wrong? Is there anything that I am doing or using incorrectly?


Solution

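  • urlopen() returns only the HTML that the server sends; it does not execute JavaScript. Many pages, YouTube channel pages among them, build much of their visible content with JavaScript after the initial load, which is most likely why the subscriber-count element is missing from html_str. One library I know of that handles this is Selenium: it drives a real browser, so the page's scripts run before you read the HTML. The cost is that it is noticeably slower than libraries that only fetch the raw HTML.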
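The approach above could be sketched roughly as follows. This is a minimal sketch, not a tested YouTube scraper. It assumes Selenium 4 and a local Chrome installation, and the element id "subscriber-count" is taken from the question and may no longer match YouTube's markup. The names extract_between and fetch_rendered_html are hypothetical helpers introduced only for illustration.

```python
def extract_between(html_str, start_marker, end_marker):
    """Return the text between two markers using str.find(), or None if absent."""
    start = html_str.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = html_str.find(end_marker, start)
    if end == -1:
        return None
    return html_str[start:end]


def fetch_rendered_html(url, wait_seconds=10):
    """Load the page in a real browser so its JavaScript runs, then return the HTML."""
    # Imported here so extract_between() works even without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumption: Chrome is available locally
    try:
        driver.get(url)
        # Wait until the element from the question exists in the live DOM.
        # Assumption: the id is still "subscriber-count".
        WebDriverWait(driver, wait_seconds).until(
            EC.presence_of_element_located((By.ID, "subscriber-count"))
        )
        return driver.page_source
    finally:
        driver.quit()


# Hypothetical stand-ins for what urlopen() and a browser would each see.
raw_html = '<html><body><div id="app"></div><script src="app.js"></script></body></html>'
rendered_html = (
    '<yt-formatted-string id="subscriber-count" '
    'class="style-scope ytd-c4-tabbed-header-renderer">'
    '106M subscribers</yt-formatted-string>'
)
marker = 'id="subscriber-count" class="style-scope ytd-c4-tabbed-header-renderer">'
```

With these samples, extract_between(raw_html, marker, '<') returns None, while the same call on rendered_html returns the subscriber text. On a real page you would pass the result of fetch_rendered_html(url) instead of the sample string; a consent or login page may appear first, in which case the wait would time out.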