mongodb azure pymongo edx openedx

PyMongo AutoReconnect: timed out


I work in an Azure environment. I have a VM that runs a Django application (Open edX) and a Mongo server on another VM instance (Ubuntu 16.04). Whenever I try to load anything in the application that is fetched from the Mongo server, I get an error like this one:

Feb 23 12:49:43 xxxxx [service_variant=lms][mongodb_proxy][env:sandbox] ERROR [xxxxx  13875] [mongodb_proxy.py:55] - Attempt 0
Traceback (most recent call last):
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/mongodb_proxy.py", line 53, in wrapper
    return func(*args, **kwargs)
  File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/contentstore/mongo.py", line 135, in find
    with self.fs.get(content_id) as fp:
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/gridfs/__init__.py", line 159, in get
    return GridOut(self.__collection, file_id)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/gridfs/grid_file.py", line 406, in __init__
    self._ensure_file()
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/gridfs/grid_file.py", line 429, in _ensure_file
    self._file = self.__files.find_one({"_id": self.__file_id})
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/collection.py", line 1084, in find_one
    for result in cursor.limit(-1):
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/cursor.py", line 1149, in next
    if len(self.__data) or self._refresh():
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/cursor.py", line 1081, in _refresh
    self.__codec_options.uuid_representation))
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/cursor.py", line 996, in __send_message
    res = client._send_message_with_response(message, **kwargs)
  File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/mongo_client.py", line 1366, in _send_message_with_response
    raise AutoReconnect(str(e))
AutoReconnect: timed out

At first I thought it was because my Mongo server lived on an instance outside of the Django application's virtual network. I created a new Mongo server on an instance inside the same virtual network and still got these errors. Mind you, I do receive the data eventually, but I wouldn't expect timed-out errors if the connection were healthy.

If it helps, here's the Ansible playbook that I used to create the Mongo server: https://github.com/edx/configuration/tree/master/playbooks/roles/mongo_3_2

I have also tailed the Mongo log file, and this is the only line that appears at the same time as the timed-out error on the application server:

2018-02-23T12:49:20.890+0000 [conn5]  authenticate db: edxapp { authenticate: 1, user: "user", nonce: "xxx", key: "xxx" }

mongostat and mongotop don't show anything out of the ordinary. Also here's the htop output: [htop screenshot]

I don't know what else to look for or how to fix this issue.


Solution

  • I forgot to change the Mongo server IPs in the Django application settings to point to the new private IP address inside the virtual network instead of the public IP. After I changed that, I don't get that issue anymore.

    If you are reading this, make sure you change the private IP to a static one in Azure if you are using that IP address in the Django application settings.
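
For anyone who wants to rule out the same mistake quickly, here is a minimal sketch (not part of Open edX or PyMongo) that checks two things from the application VM: whether the configured Mongo host is a private address, and whether a plain TCP connection to it succeeds within a short timeout. The function names and the example address `10.0.0.4` are hypothetical; 27017 is MongoDB's default port, so adjust it if your server listens elsewhere.

```python
import ipaddress
import socket


def is_private_address(host):
    """Return True or False for an IP literal, or None if `host` is a hostname."""
    try:
        return ipaddress.ip_address(host).is_private
    except ValueError:
        # Not an IP literal (e.g. a DNS name), so we cannot classify it here.
        return None


def can_connect(host, port=27017, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
```

Calling `is_private_address("10.0.0.4")` and `can_connect("10.0.0.4")` with the host from your own settings should tell you whether the application is still pointed at a public address, and whether the port is reachable at all. A successful TCP connection only shows that the network path is open; it does not prove that authentication or queries will work.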