Is there a way to improve my script so that it stops returning data from the same channel more than once? For example, I want my code to skip video data from channels that have already been scraped, so that I get only one result per channel.
from pprint import PrettyPrinter

from googleapiclient.discovery import build

# api_key must hold your YouTube Data API v3 key.
youtube = build('youtube', 'v3', developerKey=api_key)
print(type(youtube))
pp = PrettyPrinter()
nextPageToken = ''
for x in range(1):
#while True:
    request = youtube.search().list(
        q='I stand with Ukraine',
        part='id,snippet',
        maxResults=5,
        order="viewCount",
        pageToken=nextPageToken,
        type='video')
    print(type(request))
    res = request.execute()
    pp.pprint(res)
    if 'nextPageToken' in res:
        nextPageToken = res['nextPageToken']
To keep only one item per channel, track the channels you have already seen in a Python set and rebuild items, keeping only those whose channel is not yet in the set.
To do so, add this before your loop:
channelIds = set()
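Then, inside the loop, just before the nextPageToken handling, add the following (move pp.pprint(res) below it if you want the printed output to be filtered as well):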
items = []
for item in res['items']:
    channelId = item['snippet']['channelId']
    # Keep the item only the first time its channel appears.
    if channelId not in channelIds:
        channelIds.add(channelId)
        items.append(item)
res['items'] = items
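For reference, here is a minimal, self-contained sketch of the same idea that runs without an API key. The helper name keep_first_per_channel and the two hand-written pages are assumptions made for illustration only; they are not real API output, and they carry only the fields the filter reads.

```python
def keep_first_per_channel(items, seen_channel_ids):
    """Return only the items whose channel has not been seen yet.

    seen_channel_ids is updated in place, so the same set can be
    shared across several result pages.
    """
    unique_items = []
    for item in items:
        channel_id = item['snippet']['channelId']
        if channel_id not in seen_channel_ids:
            seen_channel_ids.add(channel_id)
            unique_items.append(item)
    return unique_items


# Hand-written stand-ins for two pages of search results (hypothetical data).
page_1 = {'items': [
    {'id': {'videoId': 'v1'}, 'snippet': {'channelId': 'chanA'}},
    {'id': {'videoId': 'v2'}, 'snippet': {'channelId': 'chanB'}},
    {'id': {'videoId': 'v3'}, 'snippet': {'channelId': 'chanA'}},
]}
page_2 = {'items': [
    {'id': {'videoId': 'v4'}, 'snippet': {'channelId': 'chanB'}},
    {'id': {'videoId': 'v5'}, 'snippet': {'channelId': 'chanC'}},
]}

seen = set()  # created once, before the paging loop
kept = []
for page in (page_1, page_2):
    kept += keep_first_per_channel(page['items'], seen)

kept_video_ids = [item['id']['videoId'] for item in kept]
print(kept_video_ids)  # ['v1', 'v2', 'v5']
```

Because the set lives outside the loop, a channel seen on one page is also skipped on later pages. Note that after filtering, a page can contain fewer items than maxResults.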