I have built a simple news aggregator site in which the memory usage of all my App Engine instances keeps growing until it reaches the limit, at which point the instances are shut down.
I have started to eliminate everything from my app to arrive at a minimal reproducible version. This is what I have now:
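from flask import Flask
from google.cloud import datastore

app = Flask(__name__)
datastore_client = datastore.Client()

@app.route('/')
def root():
    query = datastore_client.query(kind='source')
    query.order = ['list_sequence']
    sources = query.fetch()
    # Iterate over the results without using them
    for source in sources:
        pass
    return ''  # placeholder so the view returns a valid response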
Stats show a typical saw-tooth pattern: at instance startup, memory usage goes to 190-210 MB; then on some requests, but NOT ALL requests, it increases by 20-30 MB. (This, by the way, roughly corresponds to the estimated size of the query results, although I cannot be sure this is relevant.) This keeps happening until usage exceeds 512 MB, at which point the instance is shut down. That usually happens at around the 50th to 100th request to "/". No requests are made to anything else in the meantime.
Now, if I remove the "for" loop so that only the query remains, the problem goes away: memory usage stays flat at 190 MB, with no increase even after 100+ requests.
Calling gc.collect() at the end does not help. I have also tried comparing tracemalloc stats taken at the beginning and end of the function, but I have not found anything useful.
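For anyone wanting to reproduce that check, here is a minimal sketch of a tracemalloc before/after comparison. The names `measure`, `top_growth` and `leaky_handler` are hypothetical; `leaky_handler` is only a stand-in for the real request handler. Keep in mind that tracemalloc only tracks memory allocated through Python's allocators, so growth inside a C extension may not show up in it at all.

```python
import gc
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the `limit` source lines whose allocated size changed the most."""
    return after.compare_to(before, "lineno")[:limit]


def measure(func, limit=5):
    """Run func() once and report the allocations that survived the call."""
    if not tracemalloc.is_tracing():
        tracemalloc.start(25)  # keep up to 25 frames per allocation
    gc.collect()  # drop collectable garbage so it does not pollute the diff
    before = tracemalloc.take_snapshot()
    func()
    gc.collect()
    after = tracemalloc.take_snapshot()
    return top_growth(before, after, limit)


# Hypothetical stand-in for the handler: it deliberately leaks into a global list.
_leak = []


def leaky_handler():
    _leak.append(bytearray(1_000_000))


if __name__ == "__main__":
    for stat in measure(leaky_handler):
        print(stat)
```

If the diff stays small while the instance memory still climbs, that would suggest the growth is happening outside the Python heap.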
Has anyone experienced anything similar? Any ideas what might be going wrong here? What additional tests or investigations can you recommend? Is this possibly a Google App Engine / Datastore issue that I have no control over?
Thank you.
@Alex did some pretty good research in the other answer, so I will follow up with this recommendation: try using the NDB library. All calls with this library have to be wrapped in a context manager, which should guarantee cleanup when the context closes. That could help fix your problem:
from google.cloud import ndb

ndb_client = ndb.Client(**init_client)  # init_client: your client keyword arguments

with ndb_client.context():
    query = MyModel.query().order(MyModel.my_column)
    sources = query.fetch()
    for source in sources:
        pass

# if you try to query Datastore outside the context manager, it will raise an error
query = MyModel.query().order(MyModel.my_column)
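If every request needs Datastore access, one way to avoid repeating the `with` block in each view is to wrap the whole WSGI app. The following is only a sketch: `ndb_context_middleware` is a hypothetical helper name, and it assumes `client` is any object with a `context()` method that returns a context manager, as `ndb.Client` does.

```python
def ndb_context_middleware(wsgi_app, client):
    """Wrap a WSGI app so that each request runs inside client.context()."""

    def middleware(environ, start_response):
        # The context is entered before the view runs and exited once the
        # wrapped app has returned its response iterable.
        with client.context():
            return wsgi_app(environ, start_response)

    return middleware


# Assumed wiring for a Flask app (requires google-cloud-ndb):
#   from google.cloud import ndb
#   app.wsgi_app = ndb_context_middleware(app.wsgi_app, ndb.Client())
```

Note that the context closes as soon as the wrapped app returns, so a view that streams its response lazily would have to finish its queries before returning.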