Tags: google-cloud-platform, scikit-learn, gcp-ai-platform-notebook

How do you override Google AI Platform's standard libraries (e.g. upgrade scikit-learn) and install other libraries for custom prediction routines?


I'm currently building a pipeline and trying to see whether I can deploy an ML model to AI Platform's prediction service, then use it later in other projects via the HTTP request interface that the prediction service offers.
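For context, the HTTP call described above can be sketched with only the standard library. This is a rough illustration, assuming the AI Platform Prediction REST endpoint of the form `https://ml.googleapis.com/v1/projects/PROJECT/models/MODEL:predict` with a JSON body holding an `instances` list. The project name, model name, feature values, and access token are placeholders, and the request is only built, not sent.

```python
import json
import urllib.request


def build_predict_request(project, model, instances, access_token):
    """Build (but do not send) a POST request for online prediction."""
    url = (
        "https://ml.googleapis.com/v1/"
        f"projects/{project}/models/{model}:predict"
    )
    # The service expects the inputs under an "instances" key.
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical values, for illustration only.
request = build_predict_request(
    "my-project", "my_model", [[5.1, 3.5, 1.4, 0.2]], "ACCESS_TOKEN"
)
# urllib.request.urlopen(request) would send it; omitted in this sketch.
```

In practice the token would come from a real credential source, which is left out here.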

However, the model was built with a newer version of scikit-learn than the one offered by prediction runtime version 1.15 (this is the current version supported by Google for scikit-learn predictions). This runtime version only supports scikit-learn 0.20.4, and my model requires 0.23.1. As far as I know, everything else in the custom prediction routine works as intended, but the error received when loading the model () only ever occurs when the installed scikit-learn version is older than the one the model needs.
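One way to confirm that a version mismatch is the cause is to compare the installed scikit-learn version with the one the model needs before loading it. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the simple numeric parsing ignores pre-release and post-release suffixes.

```python
import re
from importlib import metadata


def parse_version(version):
    """Turn '0.23.1' into (0, 23, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def meets_minimum(installed, required):
    """True if `installed` is at least `required` (numeric parts only)."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.23' and '0.23.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# The versions from the question: the runtime's 0.20.4 against the required 0.23.1.
runtime_is_sufficient = meets_minimum("0.20.4", "0.23.1")
```

Calling `installed_version("scikit-learn")` inside the predictor and logging the result would show which version the runtime actually loaded.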

So I need a way to force the prediction routine to use a particular version of scikit-learn via a pip install or some equivalent. In the past I have done this in Google Dataflow via custom installs in the setup.py file, but I have yet to achieve this in AI Platform custom prediction routines. I assume it can be done?

Non-working setup.py:

from setuptools import setup
from setuptools import find_packages

REQUIRED_PACKAGES = ['scikit-learn>=0.23.1',
                 'mlxtend>=0.17.2']

setup(
    name='my_custom_code',
    version='0.1',
    install_requires=REQUIRED_PACKAGES,
    packages=find_packages(),
    include_package_data=True,
    scripts=['predictor.py']
)

Solution

  • It turns out Google does not currently support this capability. AI Platform Prediction custom containers are in a closed alpha at this stage; for the time being, I've achieved the same result using Dataflow with a setup.py file that runs custom pip install commands.
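The setup.py used for the Dataflow workaround is not shown, so the following is only a sketch of what running custom pip install commands could look like. The names `CUSTOM_REQUIREMENTS`, `pip_install_command`, and `run_custom_commands` are hypothetical; the version pins are taken from the question. In a Dataflow setup.py, a helper like this would typically be called from a custom setuptools command, and that wiring is omitted here.

```python
import subprocess
import sys

# Requirements from the question; adjust the pins as needed.
CUSTOM_REQUIREMENTS = ["scikit-learn>=0.23.1", "mlxtend>=0.17.2"]


def pip_install_command(requirement):
    """Build a pip command that targets the current interpreter."""
    return [sys.executable, "-m", "pip", "install", requirement]


def run_custom_commands(requirements, runner=subprocess.check_call):
    """Run one pip install per requirement and return the commands that ran.

    `runner` is injectable so the commands can be inspected without
    installing anything; by default each command runs in a subprocess
    and raises CalledProcessError on failure.
    """
    commands = [pip_install_command(req) for req in requirements]
    for command in commands:
        runner(command)
    return commands
```

Invoking pip through `sys.executable -m pip` rather than a bare `pip` keeps the install tied to the interpreter that runs the worker code.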