I am following the https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/scikit_bring_your_own example for product recommendations.
I want to use the SVD algorithm from the scikit-surprise library on SageMaker:
from surprise import SVD
from surprise import Dataset
from surprise.model_selection import cross_validate
I added the scikit-surprise package to the Dockerfile (shown first), but the image build fails with the errors in the log that follows it:
# Build an image that can do training and inference in SageMaker
# This is a Python 2 image that uses the nginx, gunicorn, flask stack
# for serving inferences in a stable way.
FROM ubuntu:16.04
MAINTAINER Amazon AI <[email protected]>
RUN apt-get -y update && apt-get install -y --no-install-recommends \
wget \
python \
nginx \
ca-certificates \
&& rm -rf /var/lib/apt/lists/*
# Here we get all python packages.
# There's substantial overlap between scipy and numpy that we eliminate by
# linking them together. Likewise, pip leaves the install caches populated which uses
# a significant amount of space. These optimizations save a fair amount of space in the
# image, which reduces start up time.
RUN wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py && \
pip install numpy==1.16.2 scipy==1.2.1 scikit-learn==0.20.2 pandas flask gevent gunicorn && \
(cd /usr/local/lib/python2.7/dist-packages/scipy/.libs; rm *; ln ../../numpy/.libs/* .) && \
rm -rf /root/.cache
RUN pip install scikit-surprise
# Set some environment variables. PYTHONUNBUFFERED keeps Python from buffering our standard
# output stream, which means that logs can be delivered to the user quickly. PYTHONDONTWRITEBYTECODE
# keeps Python from writing the .pyc files which are unnecessary in this case. We also update
# PATH so that the train and serve programs are found when the container is invoked.
ENV PYTHONUNBUFFERED=TRUE
ENV PYTHONDONTWRITEBYTECODE=TRUE
ENV PATH="/opt/program:${PATH}"
# Set up the program in the image
COPY products_recommender /opt/program
WORKDIR /opt/program
fullname:XXXXXXXXX.dkr.ecr.ap-southeast-1.amazonaws.com/products-recommender:latest
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
Login Succeeded
Sending build context to Docker daemon 67.58kB
Step 1/10 : FROM ubuntu:16.04
---> 13c9f1285025
Step 2/10 : MAINTAINER Amazon AI <[email protected]>
---> Using cache
---> 44baf3286201
Step 3/10 : RUN apt-get -y update && apt-get install -y --no-install-recommends wget python nginx ca-certificates && rm -rf /var/lib/apt/lists/*
---> Using cache
---> 8983fa906515
Step 4/10 : RUN wget https://bootstrap.pypa.io/get-pip.py && python get-pip.py && pip install numpy==1.16.2 scipy==1.2.1 scikit-learn==0.20.2 pandas flask gevent gunicorn && (cd /usr/local/lib/python2.7/dist-packages/scipy/.libs; rm *; ln ../../numpy/.libs/* .) && rm -rf /root/.cache
---> Using cache
---> 9dbfedf02b57
Step 5/10 : RUN pip install scikit-surprise
---> Running in 82295cb0affe
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
Collecting scikit-surprise
Downloading https://files.pythonhosted.org/packages/f5/da/b5700d96495fb4f092be497f02492768a3d96a3f4fa2ae7dea46d4081cfa/scikit-surprise-1.1.0.tar.gz (6.4MB)
Collecting joblib>=0.11 (from scikit-surprise)
Downloading https://files.pythonhosted.org/packages/28/5c/cf6a2b65a321c4a209efcdf64c2689efae2cb62661f8f6f4bb28547cf1bf/joblib-0.14.1-py2.py3-none-any.whl (294kB)
Requirement already satisfied: numpy>=1.11.2 in /usr/local/lib/python2.7/dist-packages (from scikit-surprise) (1.16.2)
Requirement already satisfied: scipy>=1.0.0 in /usr/local/lib/python2.7/dist-packages (from scikit-surprise) (1.2.1)
Requirement already satisfied: six>=1.10.0 in /usr/local/lib/python2.7/dist-packages (from scikit-surprise) (1.12.0)
Building wheels for collected packages: scikit-surprise
Building wheel for scikit-surprise (setup.py): started
Building wheel for scikit-surprise (setup.py): finished with status 'error'
ERROR: Complete output from command /usr/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-VsuzGr/scikit-surprise/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-Bb1_iT --python-tag cp27:
ERROR: running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/surprise
copying surprise/trainset.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/dataset.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/__init__.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/__main__.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/reader.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/builtin_datasets.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/dump.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/utils.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/accuracy.py -> build/lib.linux-x86_64-2.7/surprise
creating build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/search.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/__init__.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/split.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/validation.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
creating build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/algo_base.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/predictions.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/baseline_only.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/__init__.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/random_pred.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/knns.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
running egg_info
writing requirements to scikit_surprise.egg-info/requires.txt
writing scikit_surprise.egg-info/PKG-INFO
writing top-level names to scikit_surprise.egg-info/top_level.txt
writing dependency_links to scikit_surprise.egg-info/dependency_links.txt
writing entry points to scikit_surprise.egg-info/entry_points.txt
reading manifest file 'scikit_surprise.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'scikit_surprise.egg-info/SOURCES.txt'
copying surprise/similarities.c -> build/lib.linux-x86_64-2.7/surprise
copying surprise/similarities.pyx -> build/lib.linux-x86_64-2.7/surprise
copying surprise/prediction_algorithms/co_clustering.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/co_clustering.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/matrix_factorization.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/matrix_factorization.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/optimize_baselines.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/optimize_baselines.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/slope_one.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/slope_one.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
running build_ext
building 'surprise.similarities' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/surprise
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c surprise/similarities.c -o build/temp.linux-x86_64-2.7/surprise/similarities.o
unable to execute 'x86_64-linux-gnu-gcc': No such file or directory
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
ERROR: Failed building wheel for scikit-surprise
Running setup.py clean for scikit-surprise
Failed to build scikit-surprise
Installing collected packages: joblib, scikit-surprise
Running setup.py install for scikit-surprise: started
Running setup.py install for scikit-surprise: finished with status 'error'
ERROR: Complete output from command /usr/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-VsuzGr/scikit-surprise/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-rrsWf0/install-record.txt --single-version-externally-managed --compile:
ERROR: running install
running build
running build_py
creating build
creating build/lib.linux-x86_64-2.7
creating build/lib.linux-x86_64-2.7/surprise
copying surprise/trainset.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/dataset.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/__init__.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/__main__.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/reader.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/builtin_datasets.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/dump.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/utils.py -> build/lib.linux-x86_64-2.7/surprise
copying surprise/accuracy.py -> build/lib.linux-x86_64-2.7/surprise
creating build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/search.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/__init__.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/split.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
copying surprise/model_selection/validation.py -> build/lib.linux-x86_64-2.7/surprise/model_selection
creating build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/algo_base.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/predictions.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/baseline_only.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/__init__.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/random_pred.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/knns.py -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
running egg_info
writing requirements to scikit_surprise.egg-info/requires.txt
writing scikit_surprise.egg-info/PKG-INFO
writing top-level names to scikit_surprise.egg-info/top_level.txt
writing dependency_links to scikit_surprise.egg-info/dependency_links.txt
writing entry points to scikit_surprise.egg-info/entry_points.txt
reading manifest file 'scikit_surprise.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'scikit_surprise.egg-info/SOURCES.txt'
copying surprise/similarities.c -> build/lib.linux-x86_64-2.7/surprise
copying surprise/similarities.pyx -> build/lib.linux-x86_64-2.7/surprise
copying surprise/prediction_algorithms/co_clustering.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/co_clustering.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/matrix_factorization.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/matrix_factorization.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/optimize_baselines.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/optimize_baselines.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/slope_one.c -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
copying surprise/prediction_algorithms/slope_one.pyx -> build/lib.linux-x86_64-2.7/surprise/prediction_algorithms
running build_ext
building 'surprise.similarities' extension
creating build/temp.linux-x86_64-2.7
creating build/temp.linux-x86_64-2.7/surprise
x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fno-strict-aliasing -Wdate-time -D_FORTIFY_SOURCE=2 -g -fstack-protector-strong -Wformat -Werror=format-security -fPIC -I/usr/local/lib/python2.7/dist-packages/numpy/core/include -I/usr/include/python2.7 -c surprise/similarities.c -o build/temp.linux-x86_64-2.7/surprise/similarities.o
unable to execute 'x86_64-linux-gnu-gcc': No such file or directory
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
----------------------------------------
ERROR: Command "/usr/bin/python -u -c 'import setuptools, tokenize;__file__='"'"'/tmp/pip-install-VsuzGr/scikit-surprise/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-rrsWf0/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-install-VsuzGr/scikit-surprise/
WARNING: You are using pip version 19.1.1, however version 19.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
The command '/bin/sh -c pip install scikit-surprise' returned a non-zero code: 1
The push refers to repository [XXXXXXXX.dkr.ecr.ap-southeast-1.amazonaws.com/products-recommender]
89c1adca7d35: Layer already exists
ddcb6879486f: Layer already exists
4a02efecad74: Layer already exists
92d3f22d44f3: Layer already exists
10e46f329a25: Layer already exists
24ab7de5faec: Layer already exists
1ea5a27b0484: Layer already exists
latest: digest: sha256:5ed35f1964d10f13bc8a05d379913c24195ea31ec848157016381fbd1bb12f28 size: 1782
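The 'x86_64-linux-gnu-gcc' binary can't be found in the environment where you're building the container. The log shows pip downloading the scikit-surprise source tarball and trying to compile its C extensions (starting with 'surprise.similarities'), so the build needs a C compiler inside the image. Make sure that gcc is installed before the 'pip install scikit-surprise' step runs, and that it is available under the name the build expects (gcc?).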
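
As a rough sketch of what that could look like, the 'RUN pip install scikit-surprise' line in the Dockerfile might be replaced with something like the step below. The package names are my assumption rather than something the log confirms: on Ubuntu 16.04, build-essential should pull in gcc and the C library headers, and python-dev should provide the Python 2.7 headers under /usr/include/python2.7 that the compile command references.

```dockerfile
# Assumption: build-essential (gcc, libc headers) and python-dev (Python 2.7
# headers) are what the scikit-surprise C extensions need in order to compile.
# The apt lists and pip cache are removed in the same layer to keep the image small.
RUN apt-get -y update && apt-get install -y --no-install-recommends \
         build-essential \
         python-dev \
    && pip install scikit-surprise \
    && rm -rf /var/lib/apt/lists/* /root/.cache
```

The compiler has to be installed in the same RUN step as the pip install, or in an earlier one. If image size matters, installing only gcc instead of build-essential may be enough, but I have not verified that against this base image.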