429 Client Error: TooManyRequests for url

I have a script that periodically executes an ingestion command. Simplified, it looks like this:

import time
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder, ClientRequestProperties


cluster = "https://<adxname>.centralus.kusto.windows.net"
client_id = "<sp_guid>"
client_secret = "<sp_secret>"
authority_id = "<tenant_guid>"
db = "db-name"

kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(
    cluster, client_id, client_secret, authority_id)
client = KustoClient(kcsb)

query = """
.append my_table <|
another_table | where ... | summarize ... | project ...
"""

while True:
    client.execute(db, query)
    time.sleep(30.0)

So it executes a small query every 30 seconds. The query takes only milliseconds to complete. Lib version: azure-kusto-data==3.1.0.

It works fine for a while, but after some time it starts failing with this error:

requests.exceptions.HTTPError: 429 Client Error: TooManyRequests for url: https://adxname.centralus.kusto.windows.net/v1/rest/mgmt

azure.kusto.data.exceptions.KustoApiError: The control command was aborted due to throttling. Retrying after some backoff might succeed. CommandType: 'TableAppend', Capacity: 1, Origin: 'CapacityPolicy/Ingestion'.

Looking at the CapacityPolicy/Ingestion mentioned in the error, I cannot see how it is relevant. The policy is left at its default:

.show cluster policy capacity

"Policy": {
  "IngestionCapacity": {
    "ClusterMaximumConcurrentOperations": 512,
    "CoreUtilizationCoefficient": 0.75
  },
  ...
}

I do not quite understand how it can be related to concurrent operations or core utilization as ingestion is fast and rarely executed.

How can I troubleshoot this issue?

Solution

According to the error message, the ingestion capacity for your cluster is 1. This likely indicates you're using the dev SKU that has a single node with 2 cores.

With such a setup, only a single ingestion operation can run at a given time. Any additional concurrent ingestions will be throttled.
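To see why the default policy yields a capacity of 1 on such a cluster, here is a rough sketch of how the effective ingestion capacity is derived from the two policy values. The formula is my reading of the capacity policy documentation, and rounding the result down is an assumption; the function name and parameters are illustrative only.

```python
import math


def ingestion_capacity(nodes, cores_per_node,
                       max_concurrent_operations=512,
                       core_utilization_coefficient=0.75):
    """Estimate how many ingestion operations may run concurrently.

    Assumed formula:
      min(ClusterMaximumConcurrentOperations,
          nodes * max(1, cores_per_node * CoreUtilizationCoefficient))
    Rounding down to a whole number is an assumption of this sketch.
    """
    per_node = max(1.0, cores_per_node * core_utilization_coefficient)
    return math.floor(min(max_concurrent_operations, nodes * per_node))


# Single node with 2 cores, default policy values from the question.
print(ingestion_capacity(nodes=1, cores_per_node=2))  # 1
```

Under these assumptions, 512 is only an upper bound. The node and core count is what limits a small cluster.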

You have two options. Either implement tighter control over the client(s) ingesting into the cluster, so that no more than one ingestion command runs at a time and the calling code can recover from throttling errors; or scale the cluster up or out: adding nodes or cores increases the ingestion capacity.
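Recovering from throttling on the client side could look roughly like the sketch below: retry with exponential backoff and jitter. The helper `run_with_backoff` and its parameters are hypothetical, not part of azure-kusto-data. Detecting throttling by looking for "throttling" in the exception text is an assumption based on the error message quoted in the question.

```python
import random
import time


def run_with_backoff(operation, is_throttled, max_attempts=5,
                     base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call operation(), retrying with exponential backoff while throttled.

    operation    -- zero-argument callable that runs the command
    is_throttled -- callable taking the raised exception, returning True
                    when the failure is a throttling error worth retrying
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Give up on the last attempt or on any non-throttling failure.
            if attempt == max_attempts or not is_throttled(exc):
                raise
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Jitter keeps several clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))


# Hypothetical usage with the script from the question:
#
# while True:
#     run_with_backoff(lambda: client.execute(db, query),
#                      lambda exc: "throttling" in str(exc))
#     time.sleep(30.0)
```

Retrying only helps if the competing ingestions are short. If something else keeps the single slot busy for long periods, the retries will run out and the error will surface again.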

You can also check who or what else is ingesting into your cluster by running .show commands.
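As a starting point, a query along these lines groups recent commands by type and caller, which should show whether something else is competing for the single ingestion slot. The helper name and the one-hour lookback are illustrative. The column names (StartedOn, CommandType, User, Application, State) are what I expect .show commands to return, so verify them against your cluster's output.

```python
def recent_commands_query(lookback="1h"):
    """Build a management command summarizing recent commands by caller.

    lookback is a KQL timespan literal such as "1h" or "30m".
    """
    return f"""
.show commands
| where StartedOn > ago({lookback})
| summarize Count = count(), LastSeen = max(StartedOn)
    by CommandType, User, Application, State
| order by LastSeen desc
"""


# Hypothetical usage with the client from the question:
#
# response = client.execute(db, recent_commands_query("1h"))
# for row in response.primary_results[0]:
#     print(row["CommandType"], row["User"], row["State"], row["Count"])
print(recent_commands_query())
```

Rows with a CommandType of TableAppend, or another ingestion type, from an unexpected User or Application are the ones to investigate.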