Tags: python, docker, databricks, mlflow

MLflow load model fails Python


I am trying to build an API using an MLflow model.

The funny thing is that it works from one location on my PC but not from another. (The reason I moved it in the first place was that I wanted to restructure my repo.)

So, the simple code of

from mlflow.pyfunc import load_model
MODEL_ARTIFACT_PATH = "./model/model_name/"
MODEL = load_model(MODEL_ARTIFACT_PATH)

now fails with

ERROR:    Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 540, in lifespan
    async for item in self.lifespan_context(app):
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 481, in default_lifespan
    await self.startup()
  File "/usr/local/lib/python3.8/dist-packages/starlette/routing.py", line 516, in startup
    await handler()
  File "/code/./app/main.py", line 32, in startup_load_model
    MODEL = load_model(MODEL_ARTIFACT_PATH)
  File "/usr/local/lib/python3.8/dist-packages/mlflow/pyfunc/__init__.py", line 733, in load_model
    model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path)
  File "/usr/local/lib/python3.8/dist-packages/mlflow/spark.py", line 737, in _load_pyfunc
    return _PyFuncModelWrapper(spark, _load_model(model_uri=path))
  File "/usr/local/lib/python3.8/dist-packages/mlflow/spark.py", line 656, in _load_model
    return PipelineModel.load(model_uri)
  File "/usr/local/lib/python3.8/dist-packages/pyspark/ml/util.py", line 332, in load
    return cls.read().load(path)
  File "/usr/local/lib/python3.8/dist-packages/pyspark/ml/pipeline.py", line 258, in load
    return JavaMLReader(self.cls).load(path)
  File "/usr/local/lib/python3.8/dist-packages/pyspark/ml/util.py", line 282, in load
    java_obj = self._jread.load(path)
  File "/usr/local/lib/python3.8/dist-packages/py4j/java_gateway.py", line 1321, in __call__
    return_value = get_return_value(
  File "/usr/local/lib/python3.8/dist-packages/pyspark/sql/utils.py", line 117, in deco
    raise converted from None
pyspark.sql.utils.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.
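
For reference, the MLmodel file at the root of the artifact folder describes which flavors the model was saved with; for a Spark model, the spark flavor is what makes load_model delegate to mlflow.spark and ultimately PipelineModel.load, which reads the pipeline stages back from parquet. A minimal sketch to inspect it (assuming the standard MLmodel YAML layout and the same artifact path as in my snippet above):

import yaml

# Read the MLmodel descriptor at the artifact root and list its flavors.
# For a Spark ML model there should be a "spark" entry alongside "python_function".
with open("./model/model_name/MLmodel") as f:
    mlmodel = yaml.safe_load(f)

print(list(mlmodel.get("flavors", {}).keys()))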

The model artifacts are already downloaded to the /model folder, which has the following structure.

[screenshot: model folder structure, including the itemFactors subfolder]

The load_model call is in the main.py file. As I mentioned, it works from another directory, and there are no references to any absolute paths. I have also made sure that my package references are identical in both places, i.e. I have pinned them all down:

# Model
mlflow==1.25.1
protobuf==3.20.1
pyspark==3.2.1
scipy==1.6.2
six==1.15.0

Also, the same Dockerfile is used in both places, which, among other things, makes sure that the final folder structure is the same:

# ...other steps

COPY ./app /code/app
COPY ./model /code/model

What can explain it throwing this exception here, whereas in the other location (on my PC) it works, with the same model artifacts?

Since it uses the load_model function, shouldn't it be able to read the parquet files?

Happy to clarify anything if needed.

EDIT1: I have debugged this a little more in the docker container, and it seems the parquet files in the itemFactors folder (listed in my screenshot above) are not getting copied over to my image, even though I have the COPY command to copy all files under the model folder. It is copying the _started, _committed and _SUCCESS files, just not the parquet files. Does anyone know why that would be? I do NOT have a .dockerignore file. Why are those files ignored while copying?
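
For anyone checking the same thing, a quick way to see what actually ended up inside the container is to walk the copied model folder from within it (a minimal sketch; the /code/model path matches the Dockerfile above):

import os

# Walk the copied artifact folder inside the container and print every file,
# so missing parquet part-files stand out immediately.
for root, _dirs, files in os.walk("/code/model"):
    for name in files:
        print(os.path.join(root, name))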


Solution

  • I found the problem. As I wrote in EDIT1 of my post, further observation showed that the parquet files were missing in the docker container. That was strange, because I was copying the entire folder in my Dockerfile.

    I then realized that I was hitting the problem mentioned here: file paths exceeding 260 characters silently fail and do not get copied over to the docker container. This was really frustrating, because nothing failed during the build; then at run time it gave me that cryptic "Unable to infer schema for Parquet" error, essentially because the parquet files were never copied over during docker build.
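
    A small check like the one below, run on the host before building, would have caught this; it flags any artifact path longer than the 260-character limit (a minimal sketch, assuming the ./model folder from the Dockerfile):

    import os

    MAX_PATH = 260  # the limit described in the linked issue

    # Flag any file under ./model whose absolute path exceeds the limit --
    # these are the files that Docker's COPY silently skipped.
    for root, _dirs, files in os.walk("./model"):
        for name in files:
            full = os.path.abspath(os.path.join(root, name))
            if len(full) > MAX_PATH:
                print(len(full), full)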