I've built a machine learning API that uses PyTorch as the ML framework. When I upload the code to Google App Engine it runs out of memory.
After some debugging I found out that the issue is the installation of torch.
I'm using torch 1.5.0 and Python 3.7.4.
So how do I fix this error? Maybe I can change something in app.yaml?
Error message:
Step #1 - "builder": OSError: [Errno 12] Cannot allocate memory
Step #1 - "builder": self.pid = os.fork()
Step #1 - "builder": File "/usr/lib/python2.7/subprocess.py", line 938, in _execute_child
Step #1 - "builder": errread, errwrite)
Step #1 - "builder": File "/usr/lib/python2.7/subprocess.py", line 394, in __init__
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 346, in _python_version
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 332, in GetCacheKeyRaw
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 109, in GetCacheKeyRaw
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/common/single_layer_image.py", line 60, in GetCacheKey
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/layer_builder.py", line 153, in BuildLayer
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__/ftl/python/builder.py", line 114, in Build
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__.py", line 54, in main
Step #1 - "builder": File "/usr/local/bin/ftl.par/__main__.py", line 65, in <module>
Step #1 - "builder": exec code in run_globals
Step #1 - "builder": File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
Step #1 - "builder": "__main__", fname, loader, pkg_name)
Step #1 - "builder": File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
Step #1 - "builder": Traceback (most recent call last):
Again, this error message didn't appear when I didn't include torch in my requirements.txt.
To reproduce:
app.yaml
runtime: python37
resources:
  memory_gb: 16
  disk_size_gb: 10
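Side note: as far as I can tell, the standard environment (runtime: python37) ignores a resources block like the one above; runtime memory there is chosen with instance_class instead. That wouldn't change the memory available to the build step that fails above, but for reference such an app.yaml would look roughly like this (F4_1G being the largest automatic-scaling class):

runtime: python37
# raises instance memory at runtime only; the failing pip/FTL build step
# runs in Cloud Build and is not affected by this setting
instance_class: F4_1G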
requirements.txt
gunicorn==20.0.4
aniso8601==8.0.0
beautifulsoup4==4.9.0
boto3==1.13.3
botocore==1.16.3
bs4==0.0.1
certifi==2020.4.5.1
chardet==3.0.4
click==7.1.2
colorama==0.4.3
docutils==0.15.2
filelock==3.0.12
Flask==1.1.2
Flask-RESTful==0.3.8
googletrans==2.4.0
idna==2.9
itsdangerous==1.1.0
Jinja2==2.11.2
jmespath==0.9.5
joblib==0.14.1
MarkupSafe==1.1.1
numpy==1.18.4
protobuf==3.11.3
python-dateutil==2.8.1
pytz==2020.1
regex==2020.4.4
requests==2.23.0
s3transfer==0.3.3
sacremoses==0.0.43
sentencepiece==0.1.86
six==1.14.0
soupsieve==2.0
tokenizers==0.5.2
torch==1.5.0
tqdm==4.46.0
transformers==2.8.0
urllib3==1.25.9
Werkzeug==1.0.1
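A note on the torch dependency itself: the default torch wheel bundles CUDA libraries and is very large, so if only CPU inference is needed, the CPU-only build is much smaller. Pinning it would look something like this in requirements.txt (the +cpu tag and the find-links line point at PyTorch's own wheel index; they are not part of my original setup):

# CPU-only PyTorch build, much smaller than the default CUDA wheel
-f https://download.pytorch.org/whl/torch_stable.html
torch==1.5.0+cpu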
main.py
import flask
from flask import Flask, request
from flask_restful import Api, Resource

app = Flask(__name__)
api = Api(app)

production = False

import json

# Import api code

# Create main api 'view'
class main_api(Resource):
    def get(self):
        question = request.args.get('question')
        # Run the script
        # But not necessary for the minimum working test
        return {
            'question': question,
            # 'results': results_from_script,
        }

# Adds resource
api.add_resource(main_api, '/')

# Starts the api
if __name__ == '__main__':
    host = '127.0.0.1'
    port = 8080
    app.run(host=host, port=port, debug=not production)
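With the dev server started via python main.py, a quick local check of the endpoint can be done with the requests package from the list above (URL and question value are arbitrary):

# Local smoke test for the minimal API above; assumes `python main.py`
# is serving on 127.0.0.1:8080
import requests

resp = requests.get("http://127.0.0.1:8080/", params={"question": "hello"})
print(resp.status_code)  # expected: 200
print(resp.json())       # expected: {'question': 'hello'}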
I fixed this error by switching to the flex environment.
The only thing I had to change was the app.yaml:
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT main:app
runtime_config:
  python_version: 3
manual_scaling:
  instances: 1
resources:
  cpu: 2
  memory_gb: 5
  disk_size_gb: 10
And then it was ready to be deployed.
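For completeness, deploying and sanity-checking it afterwards is the usual gcloud flow (YOUR_PROJECT_ID is a placeholder):

gcloud app deploy app.yaml
curl "https://YOUR_PROJECT_ID.appspot.com/?question=hello"
# expected response: {"question": "hello"}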