Below is a snippet of code I'm using that runs fine at first, but over the course of about 5 minutes it eventually starts throwing malloc()-type errors because it uses too much memory on my laptop.
As you can see, I've even tried forcing gc.collect() calls and explicitly deleting variables with del immediately, to no avail.
Code:
The user_guids and user_groups_list objects are both basic lists with string data in each element. The update_user_groups() function makes an API call, logs the result to a database, then returns.
Edit: I have tried different values for WORKER_THREADS and it eats memory no matter what. I'm currently using between 5 and 10 while testing. Lowering the max workers slows the problem down, but it eventually fails with the same memory error.
Current code (working, but eats memory):
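    import concurrent.futures
    import gc

    with concurrent.futures.ThreadPoolExecutor(max_workers=WORKER_THREADS) as executor:
        for user in user_guids:
            # One task per user; the callback prints the result when it finishes.
            future = executor.submit(update_user_groups, user['Guid'], user_groups_list)
            future.add_done_callback(print_future_result)
            # Attempt to free memory right away (did not help).
            del future
            gc.collect()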
I've also tried it like this (eats memory as well):
    with concurrent.futures.ThreadPoolExecutor(max_workers=WORKER_THREADS) as executor:
        futures = [executor.submit(update_user_groups, user['Guid'], user_groups_list) for user in user_guids]
        concurrent.futures.wait(futures, return_when=concurrent.futures.ALL_COMPLETED)
I need it to wait until all futures are done, as this runs in a Windows service and simply "loops" again once it's complete. Is there a way to use futures like this without eating memory? Is there an alternative "multithreaded" approach that can perform the same work without the memory issues?
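For reference, one pattern that may help with the first question is to cap how many futures are pending at any moment, so the executor's internal queue never holds one work item per user. The following is only a minimal sketch under assumptions: MAX_IN_FLIGHT and run_bounded are hypothetical names, and update_user_groups here is a stand-in for the real function. It limits pending work items and Future objects; it would not fix a leak inside update_user_groups itself or a limit imposed by the interpreter.

```python
import concurrent.futures
import threading

WORKER_THREADS = 5   # same name as in the question
MAX_IN_FLIGHT = 50   # hypothetical cap on queued + running tasks


def update_user_groups(guid, groups):
    # Stand-in for the real function (API call + database logging).
    return guid, len(groups)


def run_bounded(user_guids, user_groups_list):
    """Submit one task per user, holding at most MAX_IN_FLIGHT pending futures."""
    slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)
    lock = threading.Lock()
    completed = 0

    def on_done(future):
        nonlocal completed
        try:
            future.result()  # re-raises any exception from the worker
        except Exception as exc:
            print(f"task failed: {exc!r}")
        finally:
            with lock:
                completed += 1
            slots.release()  # free a slot so the submit loop can continue

    with concurrent.futures.ThreadPoolExecutor(max_workers=WORKER_THREADS) as executor:
        for user in user_guids:
            slots.acquire()  # blocks while MAX_IN_FLIGHT tasks are pending
            future = executor.submit(update_user_groups, user["Guid"], user_groups_list)
            future.add_done_callback(on_done)
    # Leaving the with-block waits for every submitted task to finish.
    return completed


if __name__ == "__main__":
    users = [{"Guid": str(i)} for i in range(200)]
    print(run_bounded(users, ["group-a", "group-b"]), "tasks completed")
```

Because the with-block only exits once all submitted tasks have finished, the "wait until everything is done, then loop again" behaviour of the service is preserved.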
Not a lot of commentary on this one, so I'll answer my own question with a not-so-great "answer"... but it did work. Changing NOTHING in the code, I simply switched to the 64-bit version of Python and it started working without issue. It didn't even use the same amount of memory as before. The "cap" I was hitting at 2.5GB was a red flag to me that screamed 32-bit problem, so I switched to 64-bit, rebuilt my virtual environment, and re-ran the code. It works great now, and memory seems to be managed much better!
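To check which build a given interpreter or virtual environment is actually running, something like the following should work. It is a small sketch using only the standard library; interpreter_bits is a hypothetical helper name.

```python
import platform
import struct
import sys


def interpreter_bits():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print(f"Python {platform.python_version()} is a {interpreter_bits()}-bit build")
    # On a 64-bit build sys.maxsize is 2**63 - 1; on a 32-bit build it is 2**31 - 1.
    print("sys.maxsize > 2**32:", sys.maxsize > 2**32)
```

Running this inside the service's virtual environment confirms whether the rebuilt environment picked up the 64-bit interpreter.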