Handle pagination in Python when interacting with the Azure Resource Graph API

I am retrieving the tags of every resource group in my tenant with an Azure Resource Graph query. The query works perfectly in the Azure Resource Graph Explorer in the portal.

Here is the query:

resourcecontainers
| where type == 'microsoft.resources/subscriptions/resourcegroups'
| extend dates=format_datetime(now(), "yyyy-MM-dd")
| join kind=leftouter (
    resourcecontainers
    | where type == 'microsoft.resources/subscriptions'
    | project SubscriptionName=name, subscriptionId)
    on subscriptionId
| project SubscriptionName, subscriptionId, resourceGroup, 
    financial_contact=tags.financial_contact, security_contact=tags.security_contact

I am getting all the results in the portal (more than 2000 resource groups).

When I run the same query from my Python script, the response stops at 530 records. Here is my script:

from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest
import json

# Initialize Azure credentials
credentials = DefaultAzureCredential()

# Initialize Resource Graph client
resource_graph_client = ResourceGraphClient(credentials)
skip = 0
result = []


query_code = """
resourcecontainers
| where type == 'microsoft.resources/subscriptions/resourcegroups'
| extend dates=format_datetime(now(), "yyyy-MM-dd")
| join kind=leftouter (
    resourcecontainers
    | where type == 'microsoft.resources/subscriptions'
    | project SubscriptionName=name, subscriptionId)
    on subscriptionId
| project SubscriptionName, subscriptionId, resourceGroup, 
   financial_contact=tags.financial_contact, security_contact=tags.security_contact,
    environment=tags.environment,
    version=tags.version, dates, type, location, id_prefix=id
"""


query = QueryRequest(query=query_code)
query_response = resource_graph_client.resources(query)

# Keep the response as its string representation (the text shown below)
json_data = str(query_response)



output_file = "resource_groups_tags.txt"
with open(output_file, "w") as f:
    json.dump(json_data, f, indent=4)

Here is the first part of the response:

{'additional_properties': {}, 'total_records': 530, 'count': 530, 'result_truncated': 'false', 'skip_token': None, 'data': [{'SubscriptionName': '

I can't figure out how to handle pagination to get all the results, since the query itself has no skip/offset. The Microsoft documentation mentions a 'skip_token', but I did not find the explanation clear, and in my response it is set to None.

Can someone help with this?

I tried skip, limit... but skip did not work together with limit, so I don't see how to handle it.

Solution

I found the solution. I don't know why the results were limited to 530 records; the limit later changed to 1000, and the response now includes a skip_token value.

Here is the code I use:

from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest, QueryRequestOptions

def get_tags(tenant: str):
    # Initialize Azure credentials
    credentials = DefaultAzureCredential()

    # Initialize Resource Graph client
    resource_graph_client = ResourceGraphClient(credentials)
    results = []


    query_code = """
    resourcecontainers
    | where type == 'microsoft.resources/subscriptions/resourcegroups'
    | extend dates=format_datetime(now(), "yyyy-MM-dd")
    | join kind=leftouter (
        resourcecontainers
        | where type == 'microsoft.resources/subscriptions'
        | project SubscriptionName=name, subscriptionId)
        on subscriptionId
    | project SubscriptionName, subscriptionId, resourceGroup, 
       financial_contact=tags.financial_contact, security_contact=tags.security_contact,
       environment=tags.environment,
        version=tags.version, dates, type, location, id_prefix=id
    """
    

    skip_token = None
    page_count = 0

    while True:
        # Pass the token from the previous page; None requests the first page
        query = QueryRequest(
            query=query_code,
            options=QueryRequestOptions(skip_token=skip_token),
        )
        query_response = resource_graph_client.resources(query)

        # Each row of the page is a dict keyed by the projected column names
        for row in query_response.data:
            results.append({
                'environment': row.get('environment'),
                'security_contact': row.get('security_contact'),
                'subscription': row.get('SubscriptionName'),
                'subscription_id': row.get('subscriptionId'),
                'resource_group': row.get('resourceGroup'),
                'tenant': tenant,
            })

        page_count += 1
        skip_token = query_response.skip_token

        # No skip_token in the response means this was the last page
        if not skip_token:
            break

    # Number of pages fetched
    print(page_count)

    return results
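
If you want to reuse the paging logic, the loop above can be pulled out into a small generator. This is only a sketch under a few assumptions: `iter_rows` and `fake_fetch_page` are names I made up for illustration, and the fake function stands in for the real `resource_graph_client.resources(...)` call so the example runs without Azure credentials. It only assumes that each page object exposes `data` and `skip_token`, as the response in the code above does.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterator, Optional


def iter_rows(fetch_page: Callable[[Optional[str]], Any]) -> Iterator[dict]:
    """Yield every row, following skip_token until a page comes back without one."""
    skip_token = None
    while True:
        page = fetch_page(skip_token)
        yield from page.data
        skip_token = page.skip_token
        if not skip_token:
            break


# Hypothetical stand-in for the Resource Graph call: two pages, the first
# one carrying a token that points at the second.
def fake_fetch_page(skip_token: Optional[str]) -> SimpleNamespace:
    pages = {
        None: SimpleNamespace(
            data=[{"resourceGroup": "rg-1"}, {"resourceGroup": "rg-2"}],
            skip_token="token-1",
        ),
        "token-1": SimpleNamespace(
            data=[{"resourceGroup": "rg-3"}],
            skip_token=None,
        ),
    }
    return pages[skip_token]


rows = list(iter_rows(fake_fetch_page))
print(len(rows))
```

With the real client, `fetch_page` would be a function that builds a `QueryRequest` with `QueryRequestOptions(skip_token=token)` and returns `resource_graph_client.resources(query)`, exactly as in the loop above.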