I tried to extract the text of the comments on a web page from its URL, using BeautifulSoup for scraping. The comments are visible when I open the URL in a browser, but the HTML object returned by BeautifulSoup does not contain the corresponding tags or text.
I used BeautifulSoup with 'html.parser'. I successfully extracted the number of likes, views, and comments for the video on the given page, but the comment section itself was not included in the HTML. The browser I used was Chrome, and the system is Ubuntu 18.04.1 LTS.
This is the code I used (in Python):
import urllib.error
from urllib.request import urlopen
from bs4 import BeautifulSoup

webpage_link = "https://www.airvuz.com/video/Majestic-Beast-Nanuk?id=59b2a56141ab4823e61ea901"
try:
    page = urlopen(webpage_link)
except urllib.error.HTTPError as err:  # webpage cannot be found
    print("ERROR! %s" % webpage_link)
    raise  # without a page there is nothing to parse
soup = BeautifulSoup(page, 'html.parser')
I expected the soup object to contain all the content that is visible on the page, especially the text of the comments (such as "Not being there I enjoyed a lot seeing the life style of white bear. Thanks to the provider for such documentary." and "WOOOW... amazing..."); however, I could not find the corresponding nodes in the soup object. Any help would be appreciated!
The comments are generated by JavaScript via an AJAX request, so they are not part of the HTML that urlopen downloads. You can send the same request yourself and get the comments from the JSON response. You can find the request using the Network tab in the browser's inspect tool.
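The endpoint used below appears to embed the same id that is in the video page's query string. As a rough sketch, the API URL could be derived from the page URL like this. The helper name comments_api_url and the URL template are assumptions inferred from that single observed request, not a documented API:

```python
from urllib.parse import urlparse, parse_qs, urlencode

# Assumed endpoint pattern, inferred from the one request seen in the network tab.
API_TEMPLATE = "https://www.airvuz.com/api/comments/video/{video_id}?{query}"


def comments_api_url(page_url, page=1, limit=20):
    """Build the comments API URL from a video page URL (hypothetical helper)."""
    # The video id is carried in the page URL's "id" query parameter.
    params = parse_qs(urlparse(page_url).query)
    if "id" not in params:
        raise ValueError("no 'id' parameter in %s" % page_url)
    video_id = params["id"][0]
    # urlencode keeps the dict's insertion order: page first, then limit.
    query = urlencode({"page": page, "limit": limit})
    return API_TEMPLATE.format(video_id=video_id, query=query)


print(comments_api_url(
    "https://www.airvuz.com/video/Majestic-Beast-Nanuk?id=59b2a56141ab4823e61ea901"
))
```

If the site changes its URL scheme, the network tab remains the reliable way to find the current request.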
from urllib.request import urlopen
import json

# Comments API endpoint, as seen in the browser's network tab
webpage_link = "https://www.airvuz.com/api/comments/video/59b2a56141ab4823e61ea901?page=1&limit=20"
page = urlopen(webpage_link).read()
comments_json = json.loads(page)
# Each entry in 'data' is one comment; its text is under the 'comment' key
for comment_info in comments_json['data']:
    print(comment_info['comment'].strip())
Output:
Not being there I enjoyed a lot seeing the life style of white bear. Thanks to the provider for such documentary.
WOOOW... amazing...
I've been photographing polar bears for years, but to see this footage from a drones perspective was epic! Well done and congratz on the Nominee! Well deserved.
You are da man Florian!
Absolutely outstanding!
This is incredible
jaw dropping
This is wow amazing, love it.
So cool! Did the bears react to the drone at all?
Congratulations! It's awesome! I am watching in tears....
Awesome!
perfect video awesome
It is very, very beautiful !!! Sincere congratulations
Made my day, exquisite, thank you
Wow
Super!
Marvelous!
Man this is incredible!
Material is good, but edi is bad. This history about beer's family...
Muy bueno!
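The request above asks for page=1 with limit=20, so it returns at most 20 comments. If the video has more, you could walk through the pages. This is only a sketch: the function names are hypothetical, and it assumes that a page past the end comes back with an empty 'data' list, which I have not verified for this API. The fetch parameter is there so the loop can be tried without hitting the network.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """Download a URL and decode its JSON body."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_comments(video_id, limit=20, fetch=fetch_json, max_pages=50):
    """Yield stripped comment texts page by page (hypothetical helper).

    Assumption: a page past the last one returns an empty 'data' list.
    max_pages is a safety cap in case that assumption is wrong.
    """
    base = "https://www.airvuz.com/api/comments/video/%s?page=%d&limit=%d"
    for page in range(1, max_pages + 1):
        payload = fetch(base % (video_id, page, limit))
        items = payload.get("data") or []
        if not items:
            break  # no more comments
        for item in items:
            yield item["comment"].strip()
```

To use it, loop over iter_comments("59b2a56141ab4823e61ea901") and print each item.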