I'm using Python threading in a REST endpoint so that the endpoint can launch a thread and immediately return a 200 OK to the client while the thread runs. (The client then polls server state to track the thread's progress.)
The code runs in 7 seconds on my local dev system, but takes 6 minutes on an AWS EC2 m5.large.
Here's what the code looks like:
import threading
[.....]
# USES THREADING
# https://stackoverflow.com/a/1239108/364966
thr = threading.Thread(target=score, args=(myArgs1, myArgs2), kwargs={})
thr.start() # Will run score(myArgs1, myArgs2) in the background thread
thr.is_alive() # Will return whether function is running currently
data = {'now creating test scores'}
return Response(data, status=status.HTTP_200_OK)
I turned off threading to test if that was the cause of the slowdown, like this:
# USES THREADING
# https://stackoverflow.com/a/1239108/364966
# thr = threading.Thread(target=score, args=(myArgs1, myArgs2), kwargs={})
# thr.start() # Will run score(myArgs1, myArgs2) in the background thread
# thr.is_alive() # Will return whether function is running currently
# FOR DEBUGGING - SKIP THREADING TO SEE IF THAT'S WHAT'S SLOWING THINGS DOWN ON EC2
score(myArgs1, myArgs2)
data = {'now creating test scores'}
return Response(data, status=status.HTTP_200_OK)
...and it ran in 5 seconds on EC2. This strongly suggests that something about how I'm handling threads on EC2 is the cause of the slowdown.
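One way to narrow this down further would be to record both wall-clock time and per-thread CPU time for the same call, once inline and once in a background thread. The following is only a minimal sketch under stated assumptions: the `timed` wrapper and the `score-worker` thread name are hypothetical helpers introduced for illustration, and the `score` body here is a stand-in, not my real function. It assumes Python 3.7+ for `time.thread_time()`.

```python
import threading
import time


def timed(fn, results, *args, **kwargs):
    """Hypothetical helper: run fn and record wall-clock and CPU time
    for the thread that executes it."""
    wall_start = time.perf_counter()
    cpu_start = time.thread_time()  # CPU time of the current thread only
    try:
        fn(*args, **kwargs)
    finally:
        results["wall"] = time.perf_counter() - wall_start
        results["cpu"] = time.thread_time() - cpu_start
        results["thread"] = threading.current_thread().name


def score(a, b):
    # Stand-in workload (assumption); replace with the real score().
    return sum(i * i for i in range(200_000)) + a + b


# 1) Run inline, in the calling thread.
inline = {}
timed(score, inline, 1, 2)

# 2) Run the same call in a background thread, as the endpoint does.
threaded = {}
thr = threading.Thread(
    target=timed, args=(score, threaded, 1, 2), name="score-worker"
)
thr.start()
thr.join()  # Only for this measurement; the endpoint itself would not join.

print("inline:  ", inline)
print("threaded:", threaded)
```

If the threaded run shows a wall-clock time far larger than its CPU time, the thread is probably spending most of its time waiting (not being scheduled, or blocked on something) rather than computing more slowly. If both numbers grow together, the work itself is taking longer in the thread.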
Is there something I need to configure on EC2 to better support Python threads?
An AWS-certified consultant has advised me that EC2 is known to be slow in execution of Python threads, and to use AWS Lambda functions instead.