python google-api youtube-api youtube-data-api google-api-client

How does Pagination work in YouTube API (apiclient.discovery)?


I'm pretty new to working with the YouTube API and I've been looking for a way to collect a bunch of channel data. However, the API is limited to 50 results per request. To get more results, it lets you use pagination. When I run a query, the response contains the following token:

'nextPageToken': 'CDIQAA'

This token can be used to query the next set of results, so it lets me go to page 2 and get new results there. However, the token value changes when I get to page 2. This has led to the following question:

How do I use the page token/pagination to get all the results possible?

I'm aware that this query will give a lot of results and that I need to filter more ;)

from apiclient.discovery import build


api_key = "My_key"

youtube = build('youtube','v3',developerKey = api_key)
print(type(youtube))

request = youtube.search().list(
    q='Fishing',
    part='snippet',
    type='channel',
    maxResults=50
)
print(type(request))
res = request.execute()
print(res)

for item in res['items']:
    print(item['snippet']['title'])


Solution

  • I believe your goal is as follows.

    • You want to retrieve all the results of youtube.search().list(q='Fishing', part='snippet', type='channel', maxResults=50) by following pageToken from one page to the next.

    In this case, how about the following modification?

    Modified script:

    from apiclient.discovery import build
    
    
    api_key = "My_key"
    
    youtube = build('youtube', 'v3', developerKey=api_key)
    
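    data = []
    pageToken = ""  # Start with an empty token for the first request.
    while True:
        res = youtube.search().list(
            q='Fishing',
            part='snippet',
            type='channel',
            maxResults=50,
            pageToken=pageToken
        ).execute()
        # Collect the items of the current page.
        data.extend(res.get('items', []))
        # Every response carries a new token for the next page.
        # When the key is missing, this was the last page.
        pageToken = res.get('nextPageToken')
        if not pageToken:
            break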
    
    # print(len(data))  # Uncomment to check the number of retrieved items.
    
    for item in data:
        print(item['snippet']['title'])
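
The token changes on every page because each response only tells you where the next page starts, so the loop has to feed the newest token back in until no token is returned. As a rough sketch of that pattern on its own, the generator below takes any function that maps a page token to a response dict. The names iter_items, fetch_page, max_pages, FAKE_PAGES and fake_fetch are hypothetical and not part of the YouTube client; the canned pages only stand in for real API responses so the example runs without a key.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict],
               max_pages: Optional[int] = None) -> Iterator[dict]:
    """Yield items page by page, following nextPageToken until it is absent.

    fetch_page is a hypothetical callable: it receives the current page
    token (None for the first page) and returns one response dict.
    max_pages optionally caps the number of requests made.
    """
    page_token = None
    pages = 0
    while max_pages is None or pages < max_pages:
        res = fetch_page(page_token)
        yield from res.get('items', [])
        pages += 1
        # The token for the next page comes from the response just received.
        page_token = res.get('nextPageToken')
        if not page_token:
            break


# Made-up stand-in for the API: three canned pages keyed by page token.
FAKE_PAGES = {
    None: {'items': [{'id': 1}, {'id': 2}], 'nextPageToken': 'A'},
    'A': {'items': [{'id': 3}, {'id': 4}], 'nextPageToken': 'B'},
    'B': {'items': [{'id': 5}]},  # last page: no nextPageToken
}


def fake_fetch(page_token: Optional[str]) -> dict:
    return FAKE_PAGES[page_token]


all_ids = [item['id'] for item in iter_items(fake_fetch)]
print(all_ids)  # [1, 2, 3, 4, 5]
```

With the real client, fetch_page would presumably wrap the search().list(...).execute() call from the script above, passing an empty string instead of None for the first page as that script does. A cap such as max_pages can help keep a broad query like this one from running through many requests.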
    
