Recently I found a possible HTTP connection leak in my code. I see the message "Connection pool is full, discarding connection" in my log, even though my code makes few concurrent requests.
Actually, I'm creating a py2neo.Graph instance every time an API function is entered, and I don't close it when the function returns.
Because there is no close() method on py2neo.Graph instances and the official documentation says nothing about their life cycle, I assumed that an instance would die and automatically release its resources (such as the HTTP connections in the pool) once it is no longer referenced by my code, since Python deletes an object when its reference count reaches zero.
So what is the actual behavior of the instance when its last reference is released, and what is the correct way to manage py2neo.Graph instances?
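You're right that Python deletes an object when it is no longer referenced, but that's not something you should rely on for cleanup: there is no guarantee about when, or even whether, the finalizer will run (for example, objects caught in a reference cycle are only collected later by the garbage collector).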
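To illustrate why relying on reference counting is fragile, here is a small sketch. It assumes CPython, and `Resource` and `demo` are made-up names standing in for anything that owns a connection; nothing here is py2neo code.

```python
import gc


class Resource:
    """Hypothetical stand-in for an object that owns a connection."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __del__(self):
        # Record that the finalizer ran.
        self.log.append(self.name)


def demo():
    log = []
    gc.disable()  # keep the cyclic collector from running mid-demo
    try:
        a = Resource("plain", log)
        del a  # refcount reaches zero: finalized right away on CPython
        after_plain = list(log)

        b = Resource("cyclic", log)
        b.self_ref = b  # reference cycle: refcount never reaches zero
        del b
        after_cycle = list(log)  # "cyclic" has not been finalized yet

        gc.collect()  # only the cyclic collector cleans it up
        after_collect = list(log)
    finally:
        gc.enable()
    return after_plain, after_cycle, after_collect
```

Calling `demo()` shows the object in a cycle being finalized only once the garbage collector runs, which is why an explicit close is the safer habit for sockets and pools.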
For network connections, you want to close them explicitly, usually with a context manager or by calling a close() method. urllib3 provides ways to do this, but they have to be used at the py2neo level.
So Database objects contain a Connector, of type HTTPConnector if you specified an HTTP URL, and you can access it through the connector property and then close() it. And if you have a Graph, you can access the database from it. So properly closing py2neo graphs looks like this:
from py2neo import Graph

graph = Graph(...)
# use the graph...
graph.database.connector.close()  # close the connection pool
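If you go the explicit-close route, a small context manager makes sure the close happens even when the code using the graph raises. This is only a sketch: `closing_graph` is a hypothetical helper, not part of py2neo, and it assumes nothing beyond the `graph.database.connector.close()` chain shown above.

```python
from contextlib import contextmanager


@contextmanager
def closing_graph(graph):
    """Yield `graph`, then close its connection pool, even on error.

    Hypothetical helper (not a py2neo API); it relies only on the
    `graph.database.connector.close()` attribute chain.
    """
    try:
        yield graph
    finally:
        graph.database.connector.close()


# Intended usage inside an API function (Graph arguments elided):
#
#     with closing_graph(Graph(...)) as graph:
#         ...  # run queries
```

The point of the `try`/`finally` is that the pool is closed on every exit path, so a failing query does not leave connections behind.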
But I don't think py2neo is intended to be used like that: it actually has a class-level Database cache, so that when you open multiple Graph() instances (which is what you're doing), the same database gets reused, and ultimately the same connection pool.
This is why you can simply call the Database.forget_all() class method to empty the cache and close all your connections.
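To make the caching idea concrete, here is a toy model of a class-level cache with a `forget_all()` class method. It is an illustration only: `CachedDatabase` is a made-up class, and I'm not claiming this is how py2neo implements its cache internally.

```python
class CachedDatabase:
    """Toy model of a class-level instance cache; NOT py2neo's code."""

    _instances = {}  # shared by every caller, keyed by URI

    def __new__(cls, uri):
        try:
            # Same URI: hand back the existing instance (and its pool).
            return cls._instances[uri]
        except KeyError:
            inst = super().__new__(cls)
            inst.uri = uri
            inst.closed = False
            cls._instances[uri] = inst
            return inst

    def close(self):
        # A real class would close its connection pool here.
        self.closed = True

    @classmethod
    def forget_all(cls):
        """Close every cached instance and empty the cache."""
        for inst in cls._instances.values():
            inst.close()
        cls._instances.clear()
```

With a cache like this, constructing the object repeatedly with the same URI is cheap and shares one pool, and a single `forget_all()` call releases everything.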
I'm not sure if this would work in your case, and anyway it's generally faster to reuse existing connections, which urllib3 is well-equipped to do, on one condition: you need to make sure the connections are released back into the pool. If you don't, you quickly hit the connection limit, which means that creating a new connection requires discarding old ones, and that produces your "Connection pool is full, discarding connection" warning.
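Here is a simplified model of how such a warning can arise. It assumes a non-blocking pool that creates a fresh connection whenever none is idle and drops a returned connection when the idle queue is already full; `ToyPool` and `simulate` are invented for this sketch and are not urllib3's code.

```python
import queue


class ToyPool:
    """Simplified non-blocking pool; a model, not urllib3's code."""

    def __init__(self, maxsize):
        self._idle = queue.LifoQueue(maxsize)  # holds idle connections
        self.created = 0
        self.discarded = 0

    def get(self):
        try:
            return self._idle.get_nowait()  # reuse an idle connection
        except queue.Empty:
            self.created += 1  # none idle: open a new one
            return f"conn-{self.created}"

    def put(self, conn):
        try:
            self._idle.put_nowait(conn)  # release back into the pool
        except queue.Full:
            # This is the point where a "pool is full, discarding
            # connection" warning would be logged.
            self.discarded += 1


def simulate(maxsize, concurrent):
    """Borrow `concurrent` connections at once, then return them all."""
    pool = ToyPool(maxsize)
    in_flight = [pool.get() for _ in range(concurrent)]
    for conn in in_flight:
        pool.put(conn)
    return pool.discarded
```

In this model, `simulate(2, 2)` discards nothing, while `simulate(2, 3)` discards one connection: the warning only shows up once more connections are in use at the same time than the pool can hold.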
But to be honest, it looks like py2neo does not use streaming, so all connections should be put back into the pool immediately, and you should not get the warning unless you have more than 40 concurrent requests, the current default. Maybe if you share your actual code in the form of a minimal example, it will be easier to help.