I'm looking to grab all the comments from a given video at once, rather than going through them one page at a time.
from gdata import youtube as yt
from gdata.youtube import service as yts
client = yts.YouTubeService()
client.ClientLogin(username, pwd) #the pwd might need to be application specific fyi
comments = client.GetYouTubeVideoCommentFeed(video_id='the_id')
a_comment = comments.entry[0]
The above code will let you grab a single comment, likely the most recent one, but I'm looking for a way to grab all the comments at once. Is this possible with Python's gdata module?
For reference: the YouTube API docs for comments, the comment feed docs, and the Python API docs.
The following achieves what you asked for using the Python YouTube API:
from gdata.youtube import service

USERNAME = 'username@gmail.com'
PASSWORD = 'a_very_long_password'
VIDEO_ID = 'wf_IIbT8HGk'

def comments_generator(client, video_id):
    # Fetch the first page of comments, then follow the "next" links.
    comment_feed = client.GetYouTubeVideoCommentFeed(video_id=video_id)
    while comment_feed is not None:
        for comment in comment_feed.entry:
            yield comment
        next_link = comment_feed.GetNextLink()
        if next_link is None:
            # No "next" link means this was the last page.
            comment_feed = None
        else:
            comment_feed = client.GetYouTubeVideoCommentFeed(next_link.href)

client = service.YouTubeService()
client.ClientLogin(USERNAME, PASSWORD)

for comment in comments_generator(client, VIDEO_ID):
    author_name = comment.author[0].name.text
    text = comment.content.text
    print("{}: {}".format(author_name, text))
Unfortunately, the API limits the number of entries that can be retrieved to 1000. This is the error I got when I tried a tweaked version with a hand-crafted URL passed to GetYouTubeVideoCommentFeed:
gdata.service.RequestError: {'status': 400, 'body': 'You cannot request beyond item 1000.', 'reason': 'Bad Request'}
Note that the same principle should apply when retrieving entries from other feeds of the API.
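To illustrate how that principle might generalize, here is a minimal sketch of a feed-agnostic version of the generator. The name iter_feed_entries and its two callable parameters are hypothetical, not part of gdata; the sketch only assumes that each page exposes .entry and .GetNextLink() the way the comment feed above does.

```python
def iter_feed_entries(fetch_first, fetch_next):
    """Yield every entry of a paginated feed by following its "next" links.

    fetch_first: zero-argument callable returning the first page (hypothetical).
    fetch_next:  callable taking a URL and returning that page (hypothetical).
    Pages are assumed to expose .entry and .GetNextLink().
    """
    feed = fetch_first()
    while feed is not None:
        for entry in feed.entry:
            yield entry
        next_link = feed.GetNextLink()
        # Stop when the current page carries no "next" link.
        feed = fetch_next(next_link.href) if next_link is not None else None
```

With the client from the code above, the comment case would presumably look like iter_feed_entries(lambda: client.GetYouTubeVideoCommentFeed(video_id=VIDEO_ID), client.GetYouTubeVideoCommentFeed). The 1000-item limit would still apply.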
If you want to hand-craft the URL passed to GetYouTubeVideoCommentFeed, its format is:
'https://gdata.youtube.com/feeds/api/videos/{video_id}/comments?start-index={start_index}&max-results={max_results}'
The following restrictions apply: start-index <= 1000 and max-results <= 50.
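As a rough sketch of how such URLs could be generated within those limits, the helper below yields one URL per page. The function name comment_page_urls and its parameters are made up for illustration. It assumes start-index is 1-based, and it shrinks the last page so that no request reaches past item 1000, which is a cautious reading of the error message above rather than documented behaviour.

```python
COMMENTS_URL = ('https://gdata.youtube.com/feeds/api/videos/{video_id}/comments'
                '?start-index={start_index}&max-results={max_results}')

MAX_ITEM = 1000    # "You cannot request beyond item 1000."
MAX_RESULTS = 50   # largest page size the API accepts

def comment_page_urls(video_id, page_size=MAX_RESULTS, last_item=MAX_ITEM):
    """Yield hand-crafted comment feed URLs, one per page (hypothetical helper)."""
    if not 1 <= page_size <= MAX_RESULTS:
        raise ValueError('page_size must be between 1 and %d' % MAX_RESULTS)
    last_item = min(last_item, MAX_ITEM)
    # Assumption: start-index counts from 1.
    for start_index in range(1, last_item + 1, page_size):
        # Shrink the final page so it never reaches past last_item.
        max_results = min(page_size, last_item - start_index + 1)
        yield COMMENTS_URL.format(video_id=video_id,
                                  start_index=start_index,
                                  max_results=max_results)
```

Each URL could then be passed to client.GetYouTubeVideoCommentFeed, stopping early once a page comes back with no entries.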