Gevent throws exception from wrong thread (greenlet)


Gevent appears to be raising networking-related errors in greenlets that are not currently accessing the network. Here's an example traceback with some details elided for brevity:

Traceback (most recent call last):
  ...
  File ".../asyncforms.py", line 95, in _try_to_process_form
    self.pool.spawn(self._async_migrate_form, wrapped_form, case_ids)
  File ".../lib/python3.6/site-packages/gevent/pool.py", line 391, in spawn
    self.start(greenlet)
  File ".../lib/python3.6/site-packages/gevent/pool.py", line 601, in start
    self.add(greenlet, *args, **kwargs)
  File ".../lib/python3.6/site-packages/gevent/pool.py", line 634, in add
    if not self._semaphore.acquire(blocking=blocking, timeout=timeout):
  File "src/gevent/_semaphore.py", line 100, in gevent.__semaphore.Semaphore.acquire
  File "src/gevent/_semaphore.py", line 128, in gevent.__semaphore.Semaphore.acquire
  File "src/gevent/_abstract_linkable.py", line 192, in gevent.__abstract_linkable.AbstractLinkable._wait
  File "src/gevent/_abstract_linkable.py", line 165, in gevent.__abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_abstract_linkable.py", line 169, in gevent.__abstract_linkable.AbstractLinkable._wait_core
  File "src/gevent/_greenlet_primitives.py", line 60, in gevent.__greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 60, in gevent.__greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/_greenlet_primitives.py", line 64, in gevent.__greenlet_primitives.SwitchOutGreenletWithLoop.switch
  File "src/gevent/__greenlet_primitives.pxd", line 35, in gevent.__greenlet_primitives._greenlet_switch
socket.gaierror: [Errno -9] Address family for hostname not supported

As you can see, the current greenlet is spawning a new greenlet in a gevent.pool.Pool, which to my knowledge should not hit the network. The function that eventually runs in the new greenlet may access the network, but the act of spawning it should not, and nothing in the traceback suggests that the spawned function is what is running here.

Why is gevent raising this seemingly unrelated network error at this place in the code? I suspect the error is coming from another greenlet that is accessing the network. Is there a way to get the real traceback context for this error?
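One way to keep the original context is to record the traceback inside the greenlet that fails, before the exception leaves it. The following is a minimal sketch using only the standard library; the decorator name log_origin and the logger name greenlet_errors are hypothetical, and it assumes the error is raised by the function passed to spawn rather than somewhere else in the event loop.

```python
import functools
import logging
import traceback

# Hypothetical logger name, chosen for this example only.
log = logging.getLogger("greenlet_errors")


def log_origin(func):
    """Log the full traceback where the error is raised, then re-raise it."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # format_exc() captures the stack of the failing call itself,
            # regardless of where the exception is reported afterwards.
            log.error(
                "unhandled error in %s:\n%s",
                func.__name__,
                traceback.format_exc(),
            )
            raise

    return wrapper
```

With the code from the traceback, the spawn call would then look something like self.pool.spawn(log_origin(self._async_migrate_form), wrapped_form, case_ids). The exception still propagates as before; the wrapper only adds a log entry that points at the real origin.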

Not sure if any of the following are relevant, but for context:

On startup the process applies some patches. This is a simplified version of the actual code:

from gevent import monkey
from psycogreen.gevent import patch_psycopg

monkey.patch_all(subprocess=True)
patch_psycopg()

Edit: a bit later it sets gevent.get_hub().SYSTEM_ERROR = BaseException to make the program exit immediately if any greenlet crashes. Maybe this has other unexpected side effects, such as these confusing tracebacks?

System and library versions:

  • Linux djangomanage1-production 4.15.0-1041-aws #43-Ubuntu SMP Thu Jun 6 13:39:11 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Python 3.6.8
  • gevent 1.4.0
  • greenlet 0.4.15

Solution

Indeed, it does appear that

    gevent.get_hub().SYSTEM_ERROR = BaseException

has unexpected side effects. The socket.gaierror, which was intermittent but fairly frequent, has not occurred since I commented out that line.
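A likely explanation, as far as I understand the hub's error handling: when a greenlet dies with an exception that matches SYSTEM_ERROR, the hub re-raises that exception in the main greenlet, at whatever point the main greenlet last yielded. With SYSTEM_ERROR set to BaseException, every unhandled greenlet error is treated that way, so the traceback shows the main greenlet's stack (here, waiting in pool.spawn) rather than the stack of the greenlet that actually failed. The sketch below tries to reproduce this in isolation; failing_worker and demo are hypothetical names, and the ValueError merely stands in for the socket.gaierror.

```python
import gevent


def failing_worker():
    # Hypothetical stand-in for a greenlet whose network call fails.
    raise ValueError("raised inside the worker greenlet")


def demo():
    hub = gevent.get_hub()
    previous = hub.SYSTEM_ERROR
    hub.SYSTEM_ERROR = BaseException  # same setting as in the question
    try:
        gevent.spawn(failing_worker)
        try:
            # The main greenlet yields to the hub here, so the worker
            # gets a chance to run and fail.
            gevent.sleep(0.1)
        except ValueError as exc:
            # The worker's exception arrives in the main greenlet,
            # with gevent.sleep at the bottom of the traceback.
            return "surfaced in main greenlet: %s" % exc
        return "nothing surfaced in main greenlet"
    finally:
        hub.SYSTEM_ERROR = previous  # restore the previous setting


if __name__ == "__main__":
    print(demo())
```

If this reading is right, removing the SYSTEM_ERROR line does not necessarily stop the underlying network error from happening; it may only keep the error inside the greenlet that raised it, where gevent reports it with that greenlet's own traceback.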