python django multithreading amazon-ec2 python-multithreading

Slow EC2 Performance with Python Threading?

I'm using Python threading in a REST endpoint so that the endpoint can launch a thread and immediately return a 200 OK to the client while the thread keeps running. (The client then polls server state to track the thread's progress.)
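For context, the launch-then-poll pattern described above can be sketched roughly as follows. This is only a minimal illustration, not the actual endpoint code: `JOBS`, `start_job`, and `job_status` are hypothetical names, and the in-memory registry assumes a single server process.

```python
import threading
import uuid

# Hypothetical in-memory job registry. With several worker processes,
# shared storage (a database or cache) would be needed instead.
JOBS = {}
JOBS_LOCK = threading.Lock()


def _run_job(job_id, work, args):
    """Run `work` and record its state so a polling request can read it."""
    try:
        result = work(*args)
        with JOBS_LOCK:
            JOBS[job_id] = {"state": "done", "result": result}
    except Exception as exc:
        with JOBS_LOCK:
            JOBS[job_id] = {"state": "failed", "error": str(exc)}


def start_job(work, *args):
    """Start `work(*args)` in a background thread and return a job id."""
    job_id = uuid.uuid4().hex
    with JOBS_LOCK:
        JOBS[job_id] = {"state": "running"}
    thr = threading.Thread(target=_run_job, args=(job_id, work, args), daemon=True)
    thr.start()
    return job_id


def job_status(job_id):
    """Return a copy of the recorded state, as a polling endpoint might."""
    with JOBS_LOCK:
        return dict(JOBS.get(job_id, {"state": "unknown"}))
```

The view that launches the work would call `start_job(...)` and return the id right away. A second, hypothetical polling view would return `job_status(job_id)`.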

The code runs in 7 seconds on my local dev system, but takes 6 minutes on an AWS EC2 m5.large.

Here's what the code looks like:

    import threading
    [.....]

    # USES THREADING
    # https://stackoverflow.com/a/1239108/364966
    thr = threading.Thread(target=score, args=(myArgs1, myArgs2), kwargs={})
    thr.start() # Will run score(myArgs1, myArgs2) in the background
    thr.is_alive() # Will return whether function is running currently

    data = {'now creating test scores'}
    return Response(data, status=status.HTTP_200_OK)

I turned off threading to test if that was the cause of the slowdown, like this:

    # USES THREADING
    # https://stackoverflow.com/a/1239108/364966
    # thr = threading.Thread(target=score, args=(myArgs1, myArgs2), kwargs={})
    # thr.start() # Will run score(myArgs1, myArgs2) in the background
    # thr.is_alive() # Will return whether function is running currently

    # FOR DEBUGGING - SKIP THREADING TO SEE IF THAT'S WHAT'S SLOWING THINGS DOWN ON EC2
    score(myArgs1, myArgs2)

    data = {'now creating test scores'}
    return Response(data, status=status.HTTP_200_OK)

...and it ran in 5 seconds on EC2. This suggests that the slowdown comes from running the work in a background thread on EC2, rather than from the scoring code itself.
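One way to narrow this down further is to compare wall-clock time with the CPU time the thread actually consumed. The following is a minimal sketch, assuming Python 3.7+ (for `time.thread_time`); the `timed` wrapper is a hypothetical helper, not part of the original code.

```python
import threading
import time


def timed(func):
    """Wrap func so each call reports wall-clock and per-thread CPU time."""
    def wrapper(*args, **kwargs):
        wall_start = time.perf_counter()
        cpu_start = time.thread_time()  # CPU time of the calling thread only
        try:
            return func(*args, **kwargs)
        finally:
            wall = time.perf_counter() - wall_start
            cpu = time.thread_time() - cpu_start
            # Keep the last measurement so it can also be inspected in code.
            wrapper.last = {"wall": wall, "cpu": cpu}
            print(f"{func.__name__}: wall={wall:.2f}s cpu={cpu:.2f}s "
                  f"thread={threading.current_thread().name}")
    wrapper.last = None
    return wrapper
```

Passing `target=timed(score)` to `threading.Thread` would log both numbers for the threaded run. If wall time is far larger than CPU time, the thread is probably spending most of its time waiting (for I/O, locks, the GIL, or the scheduler) rather than computing. If both are large, the work itself is running slower.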

Is there something I need to configure on EC2 to better support Python threads?

Solution

An AWS-certified consultant advised me that EC2 is known to be slow at executing Python threads, and recommended using AWS Lambda functions instead.