Tags: python, deployment, pytorch, aws-app-runner

AWS App Runner fails health check when PyTorch is involved, but the container works locally


I have been experimenting with AWS App Runner. I found a basic tutorial app that uses Flask. Here is the code:

from flask import Flask, render_template

app = Flask(__name__)


@app.route('/')
def home():
    # Renders templates/index.html
    return render_template('index.html')


@app.route('/app')
def blog():
    return "Hello, from App!"


if __name__ == '__main__':
    # Bind to all interfaces so the server is reachable from outside the container
    app.run(threaded=True, host='0.0.0.0', port=80)

And here is the Dockerfile:

FROM python:3.7-slim

COPY ./requirements.txt /app/requirements.txt

WORKDIR /app

RUN pip install -r requirements.txt

COPY . /app

EXPOSE 80

ENTRYPOINT [ "python" ]

CMD [ "app.py" ]

I easily managed to deploy this setup on App Runner. However, when I tried to deploy my own app, the deployment failed with a health check error, even though the same container ran locally without errors. That suggests a compatibility issue with App Runner.

09-19-2023 08:10:37 PM [AppRunner] Deployment with ID : da2bb9----- failed. Failure reason : Health check failed.
09-19-2023 08:10:25 PM [AppRunner] Health check failed on port '80'. Check your configured port number. For more information, read the application logs.
09-19-2023 08:04:14 PM [AppRunner] Performing health check on port '80'.
09-19-2023 08:04:04 PM [AppRunner] Provisioning instances and deploying image for publicly accessible service.
09-19-2023 08:03:53 PM [AppRunner] Successfully copied the image from ECR.
09-19-2023 07:52:50 PM [AppRunner] Deployment Artifact :- Repo Type: ECR; Image URL : 218512261774.dkr.ecr.us-west-2.amazonaws.com/test7; Image Tag : new
09-19-2023 07:52:50 PM [AppRunner] Deployment with ID : da2bb9532------- started. Triggering event: SERVICE_CREATE
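One way to narrow this down locally is to mimic the health check against the running container. The sketch below assumes the check is a plain TCP connect to the configured port, retried a few times. The name wait_for_port and its default values are hypothetical, chosen for illustration; they are not App Runner's actual implementation or settings.

```python
import socket
import time


def wait_for_port(host: str, port: int, attempts: int = 5,
                  delay: float = 5.0, timeout: float = 2.0) -> bool:
    """Return True as soon as a TCP connection to host:port succeeds.

    Tries up to `attempts` times, sleeping `delay` seconds between tries.
    The defaults are illustrative assumptions, not documented service values.
    """
    for attempt in range(1, attempts + 1):
        try:
            # create_connection raises OSError on refusal or timeout
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            if attempt < attempts:
                time.sleep(delay)
    return False
```

After starting the container locally with the port published (for example, docker run -p 80:80 followed by your image name), calling wait_for_port("127.0.0.1", 80) shows whether the server starts accepting connections within the retry window.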

I tried to pinpoint the cause and ended up deploying 7 different images with various code configurations. Eventually I produced a minimal reproduction of the issue: adding the torch==2.0.1 dependency to the requirements.txt file (I was experimenting with Python 3.10 at that point, but everything else was the same).

With Python 3.10, the deployment passes the health check without torch and fails with torch installed. I also experimented with different ports.

My question is: what might be causing this, and how can I fix it? I am quite new to containers and AWS deployment, but I have done my research and couldn't find a solution.


Solution

  • It works without a problem if you use the CPU-only version of PyTorch. Add this line to the Dockerfile just after the requirements.txt line. (If you have other dependencies that use torch, install them after this line so that they are installed against this version of torch.)

    RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

    (Add the whole line, including the https://download.pytorch.org/whl/cpu index URL.)
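To confirm which build actually ended up in the image, a small check like the following can be run inside the container. It is only a sketch: it relies on torch.__version__ and torch.version.cuda (the latter is None for CPU-only builds), and describe_torch_build is a hypothetical helper name, not part of PyTorch.

```python
from typing import Optional


def describe_torch_build(version: str, cuda_version: Optional[str]) -> str:
    """Summarize whether a PyTorch build is CPU-only.

    `cuda_version` mirrors torch.version.cuda, which is None when the
    installed wheel was built without CUDA support.
    """
    if cuda_version is None:
        return f"torch {version}: CPU-only build"
    return f"torch {version}: built with CUDA {cuda_version}"


def main() -> None:
    try:
        # Imported lazily so the helper above works without torch installed
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print(describe_torch_build(torch.__version__, torch.version.cuda))


if __name__ == "__main__":
    main()
```

If the output reports a CUDA version, the CPU index URL was probably not picked up, for instance because torch had already been installed from the default index in an earlier step.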