Search code examples
pythonazure-storageazure-eventhubaiohttpazure-sdk-python

Why is len(self) throwing a RecursionError in aiohttp? A.k.a. Strange behaviour of the azure event hubs libary


That's a weird one.

We have a RecursionError thrown by aiohttp:

File "/usr/local/lib/python3.8/site-packages/aiohttp/streams.py", line 169, in on_eof
    callback()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 943, in _response_eof
    self._connection.release()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 171, in release
    self._key, self._protocol, should_close=self._protocol.should_close
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_proto.py", line 53, in should_close
    or len(self) > 0

But we cannot identify any code that would be recursing in our main code.

Pipfile:

azure-eventhub-checkpointstoreblob-aio = "*"
azure-eventhub-checkpointstoreblob = "*"

Locked versions are:

azure-eventhub = "==5.5.0"
aiohttp = "==3.7.4.post0"
azure-eventhub-checkpointstoreblob = "==1.1.4"
azure-eventhub-checkpointstoreblob-aio = "==1.1.4"

The above error occurs in the coroutine that azure-eventhub creates when recieving. From the event hub libary we get a wrapper message for above error on loglevel warning:

EventProcessor instance '...' of eventhub 'usage-ingress' consumer group '$Default'. An error occurred while load-balancing and claiming ownership. The exception is RecursionError('maximum recursion depth exceeded'). Retrying after 11.889686314484639 seconds

Recursion error on such a method?? I already looked up the code bits on the stacktrace and I am baffled.

Details about the Infrastructure:

Checkpointing is done on an Azure Storage Datalake Gen2 with Hirarchical Filesystem. Libary built in request logging for the requests to storage show a 200 for the ownership and checkpoint requests but also the message

"No body was attached to the request"

Our service started to get slow (1-5 Events per hour throughput on the reciever with 100k Events coming into the Eventhub) without any code change or redeployment.

We already suspected eventhub service throttling but no throttled requests on the metrics. Also no throttling on the Data Lake. Except a really long access time on the checkpoints folder that is created on the filesystem by the eventhub libary. Access tier is hot.

I am completely clueless on why this is happening. I hope somebody has an idea. Cheers

EDIT: Related to this:

"level": "ERR", "message": "Exception in eof callback", "threadid": 140306312525568, "processid": 10, "channel": "aiohttp.internal", "exception": "RecursionError", "stacktrace": "  File \"/usr/local/lib/python3.8/site-packages/aiohttp/streams.py\", line 169, in on_eof\n    callback()\n  File \"/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py\", line 943, in _response_eof\n    self._connection.release()\n  File \"/usr/local/lib/python3.8/site-packages/aiohttp/connector.py\", line 167, in release\n    self._notify_release()\n"

EDIT#2: In response to Adam:

Jun 24, 2021 @ 01:16:21.628 Request URL: 'https://<storage>.blob.core.windows.net/<resGroup>?restype=REDACTED&comp=REDACTED&prefix=REDACTED&marker=REDACTED&include=REDACTED'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629 Request method: 'GET'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629     'Authorization': 'REDACTED' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629 No body was attached to the request INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629 Request headers:    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629     'x-ms-version': 'REDACTED'  INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629     'Accept': 'application/xml' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629     'x-ms-date': 'REDACTED' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629     'x-ms-client-request-id': '05acf018-d479-11eb-af7d-a2f51e4cb2c7'    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:21.629     'User-Agent': 'azsdk-python-storage-blob/12.7.1 Python/3.8.10 (Linux-5.4.0-1047-azure-x86_64-with-glibc2.28)'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.924 Response status: 200    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.924     'Transfer-Encoding': 'chunked'  INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.924 Exception in eof callback   ERR   File "/usr/local/lib/python3.8/site-packages/aiohttp/streams.py", line 169, in on_eof
    callback()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 943, in _response_eof
    self._connection.release()
  File "/usr/local/lib/python3.8/site-packages/aiohttp/connector.py", line 171, in release
    self._key, self._protocol, should_close=self._protocol.should_close
  File "/usr/local/lib/python3.8/site-packages/aiohttp/client_proto.py", line 53, in should_close
    or len(self) > 0
    RecursionError  aiohttp.internal
    Jun 24, 2021 @ 01:16:24.924 Response headers:   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.925     'x-ms-version': 'REDACTED'  INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.925     'Date': 'Wed, 23 Jun 2021 23:16:24 GMT' INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.925     'Content-Type': 'application/xml'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.925     'Server': 'Windows-Azure-Blob/1.0 Microsoft-HTTPAPI/2.0'    INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.925     'x-ms-request-id': '57d01bb6-e01e-0019-7185-68c5ca000000'   INF  -   -  azure.core.pipeline.policies.http_logging_policy
    Jun 24, 2021 @ 01:16:24.925     'x-ms-client-request-id': 'f789fce2-d478-11eb-80e0-a2f51e4cb2c7'

Seems like the error is caused when hitting some kind of eof in the request phase.


Solution

  • So my problem was resolved through increasing the partition count on the Event Hubs and the manually deleting the ownership blob for that Event Hub.

    Still pretty strange, but I could not come up with a clear repro of this error.