I'm extremely new to web scraping and I'm making a simple program with Python which uses string methods such as str.find().
Currently, I extract the HTML code of a webpage as a string via
from urllib.request import urlopen
html_str = urlopen(url).read().decode('utf-8')
However, I am confused as to why not all of the code is returned. For example, a YouTube channel page displays the subscriber count with
<yt-formatted-string id="subscriber-count" class="style-scope ytd-c4-tabbed-header-renderer">106M subscribers</yt-formatted-string>
But this string does not appear in html_str.
So, what's going wrong? Is there anything that I am doing or using incorrectly?
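Nothing is wrong with your code. urlopen only downloads the raw HTML that the server sends; it does not run the page's JavaScript. Sites like YouTube build much of what you see in the browser with JavaScript after the page loads, so an element such as the subscriber count is most likely missing from the raw response. One library that I know does run JavaScript is Selenium, because it drives a real browser. The cost is that it runs noticeably slower than libraries that only make HTTP requests.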
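To make that concrete, here is a rough sketch rather than a definitive solution. The helper names text_after_marker and fetch_rendered_html are my own, and I'm assuming Selenium 4 with Chrome installed. The offline part reuses the tag from the question as sample data. YouTube's markup may have changed and the page may show a consent screen first, so treat the element id as an assumption too.

```python
def text_after_marker(html, marker, end="<"):
    """Return the text between the end of `marker` and the next `end`.

    Returns None when `marker` is not in `html`, which is what happens
    when the element was never part of the downloaded HTML.
    """
    start = html.find(marker)
    if start == -1:
        return None
    start += len(marker)
    stop = html.find(end, start)
    return html[start:stop] if stop != -1 else html[start:]


def fetch_rendered_html(url, element_id, timeout=10):
    """Load `url` in a real browser and return the HTML after scripts ran.

    Hypothetical helper: assumes Selenium 4 and a local Chrome install.
    """
    # Imported here so text_after_marker works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Wait until the element exists in the DOM; its text may still
        # be filled in slightly later on some pages.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.ID, element_id))
        )
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors


# Offline demonstration using the tag quoted in the question.
marker = ('<yt-formatted-string id="subscriber-count" '
          'class="style-scope ytd-c4-tabbed-header-renderer">')
rendered_sample = marker + "106M subscribers</yt-formatted-string>"
raw_sample = "<html><body><script>/* page built later */</script></body></html>"

print(text_after_marker(rendered_sample, marker))  # text inside the tag
print(text_after_marker(raw_sample, marker))       # None: tag is absent
```

With a real page you would call html_str = fetch_rendered_html(url, "subscriber-count") in place of the urlopen line and keep your str.find() logic unchanged.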