azure, tensorflow, azureml-python-sdk, absl-py

Azure deployment error "TypeError: metaclass conflict" when importing tensorflow (1.13.1) in project code


Any help fixing the problem would be greatly appreciated.

I am trying to deploy an old CNN model (MRCNN), last deployed in December 2020, on Azure ML studio using the Python SDK V2. The deployment itself is created successfully, but once the image is built and my entry script is loaded, I run into this error:

File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/azureml_inference_server_http/server/user_script.py", line 73, in load_script
    main_module_spec.loader.exec_module(user_module)
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/azureml-app/ProcessImage/__azureml_entry__.py", line 14, in <module>
    from samples.coco.inference import loadFlowchart, loadBarchart, loadFramework, detect_sketch
  File "/var/azureml-app/ProcessImage/samples/coco/inference.py", line 11, in <module>
    import tensorflow as tf
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
    from tensorflow.python import pywrap_tensorflow  # pylint: disable=unused-import
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 70, in <module>
    from tensorflow.python.framework.framework_lib import *  # pylint: disable=redefined-builtin
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/tensorflow/python/framework/framework_lib.py", line 25, in <module>
    from tensorflow.python.framework.ops import Graph
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 54, in <module>
    from tensorflow.python.platform import app
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 24, in <module>
    from tensorflow.python.platform import flags
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/tensorflow/python/platform/flags.py", line 25, in <module>
    from absl.flags import *  # pylint: disable=wildcard-import
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/absl/flags/__init__.py", line 35, in <module>
    from absl.flags import _argument_parser
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/absl/flags/_argument_parser.py", line 82, in <module>
    class ArgumentParser(Generic[_T], metaclass=_ArgumentParserCache):
TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 130, in setup
    self.user_script.load_script(AML_APP_ROOT)
  File "/azureml-envs/azureml_27d1bdb1f51e3c046e82b0c2ef18ac11/lib/python3.6/site-packages/azureml_inference_server_http/server/user_script.py", line 75, in load_script
    raise UserScriptImportException(ex) from ex
azureml_inference_server_http.server.user_script.UserScriptImportException: Failed to import user script because it raised an unhandled exception

Based on my research, the cause could be a deprecated library, but I am having a hard time identifying which one. Is there a smarter way to find it than going through each of the libraries manually?
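One way to narrow this down without checking every library by hand is to print the interpreter version and the installed versions of the packages named in the traceback from inside the deployed environment. The sketch below is only an illustration: the helper names `installed_version` and `report` are hypothetical, and the package list is an assumption taken from the traceback and the environment file. It relies on `importlib.metadata` where available and falls back to `pkg_resources` on older interpreters such as Python 3.6.

```python
import sys


def installed_version(package_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        from importlib import metadata  # standard library on Python 3.8+
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(package_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters (for example Python 3.6 or 3.7).
    try:
        import pkg_resources  # shipped with setuptools
    except ImportError:
        return None
    try:
        return pkg_resources.get_distribution(package_name).version
    except pkg_resources.DistributionNotFound:
        return None


def report(packages):
    """Print and return the interpreter version and the given package versions."""
    versions = {name: installed_version(name) for name in packages}
    print("python", sys.version.split()[0])
    for name, version in versions.items():
        print(name, version or "not installed")
    return versions


# Packages that appear in the traceback or are pinned in the environment file.
versions = report(["absl-py", "tensorflow", "tensorflow-gpu"])
```

Running this at the top of the entry script (before `import tensorflow`) shows which Python and absl-py versions the image actually resolved, which can then be compared against the pins in the conda file.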

Here are the conda environment details I am using in my deployment:

channels:
  - conda-forge
  - anaconda
dependencies:
  - tensorflow-gpu==1.12.0
  - pip==21.1.3
  - python==3.7.16
  - Cython==0.29.14
  - pycocotools==2.0.0
  - pip:
      - setuptools==59.6.0
      - wheel==0.37.1
      - Pillow==6.2.1
      - numpy==1.17.4
      - pandas==0.25.3
      - azure-storage-blob==12.7.0
      - XlsxWriter==1.2.7
      - google-cloud-vision==0.42.0
      - tensorboard==1.13.1
      - tensorflow==1.13.1
      - absl-py>=0.11.0
      - Keras-Applications==1.0.8
      - Keras-Preprocessing==1.1.0
      - scikit-image==0.16.2
      - scipy==1.3.3
      - matplotlib==3.1.2
      - zipp==0.6.0
      - urllib3==1.25.7
      - ipykernel==5.1.3
      - ipyparallel==6.2.4
      - ipython==7.9.0
      - ipywidgets==7.5.1
      - pathlib==1.0.1
      - pathlib2==2.3.5
      - apscheduler==3.6.3
      - scikit-learn==0.23.1
      - google-api-core==1.16.0
      - google-auth==1.7.1
      - google-auth-oauthlib==0.4.1
      - google-cloud-vision==0.42.0
      - google-pasta==0.1.8
      - h5py==2.10.0
      - python-http-client==3.2.5
      - sendgrid==6.1.2
      - azure-cognitiveservices-vision-computervision==0.7.0
      - keras==2.1.0
      - grpcio==1.25.0
      - graphviz==0.8.4
      - ipython-genutils==0.2.0
      - azureml-defaults>=1.13.0
      - opencv-python==4.1.2.30
      - imgaug==0.3.0
      - googleapis-common-protos==1.51.0
      - pathos
name: tensorflow-gpu-1.12-cuda11

Solution:

I had to upgrade TensorFlow to 2.x (I chose 2.0.0). Consequently, I had to modify my code to work with TensorFlow v2. There are specific, documented steps on how to migrate to v2.

In my case the code was MRCNN, and there are plenty of blogs, SO questions, and git repos that were helpful in properly migrating it to v2.
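As a hedged illustration of one common first migration step (not necessarily the one used here), TensorFlow 2.x ships a `tensorflow.compat.v1` module that lets TF1-style graph code keep running while it is ported. The helper name `load_tf_v1_compat` below is hypothetical, and the function returns `None` when the compatibility module cannot be imported so the sketch can run anywhere.

```python
def load_tf_v1_compat():
    """Return the TF1 compatibility module with v2 behavior disabled, or None.

    Assumes a TensorFlow 2.x install; if tensorflow.compat.v1 cannot be
    imported, the function returns None instead of raising.
    """
    try:
        import tensorflow.compat.v1 as tf1
    except ImportError:
        return None
    # Switch off eager execution and other v2 defaults so graph/session
    # style code behaves as it did under TensorFlow 1.x.
    tf1.disable_v2_behavior()
    return tf1


tf = load_tf_v1_compat()
if tf is None:
    print("TensorFlow compat.v1 is not available in this environment")
else:
    print("Running TF1-style code on TensorFlow", tf.__version__)
```

This shim only buys time: code that depends on removed modules such as `tf.contrib` still has to be rewritten as part of a full migration.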


Solution

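  • TypeError: metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases

    Python raises this error when a class declares a metaclass that is not compatible with the metaclasses of its base classes.

    • Given that you're using TensorFlow and absl, this issue most likely stems from the interaction between TensorFlow 1.13.1 and the absl-py version that got installed. The traceback fails inside absl's ArgumentParser, which combines typing.Generic with a custom metaclass, and on Python 3.6 typing.Generic has a metaclass of its own (GenericMeta). Note that your environment file leaves absl-py without an upper bound (absl-py>=0.11.0).

    In most cases, upgrading or downgrading the specific dependency resolves a metaclass conflict.

    One option is to pin a current stable TensorFlow release, for example tensorflow==2.16.1, in your config file.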

    • Upgrade your project's dependencies to the specified version
    pip install -r requirements.txt --upgrade
    

    Sometimes, when TensorFlow 2 is installed but the application only supports TensorFlow 1.x, you get an error like: AssertionError: only tensorflow v1 is supported.

    • In that case TensorFlow 1.x is required. TensorFlow 1.15 is available for 64-bit Python 3.6 on Windows, but downgrading to Python 3.6 is no longer supported.

    Install the latest 1.x release with pip install 'tensorflow<2.0' and update requirements.txt (or the equivalent file) to match.
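To make the metaclass conflict described at the top of this answer concrete, here is a minimal, self-contained sketch that reproduces the same error message and shows one general way to resolve it. The names `MetaA`, `MetaB`, `MetaAB`, `Base`, and `Child` are hypothetical and are not taken from absl or TensorFlow; the sketch only illustrates the Python rule, not the exact fix for this deployment.

```python
class MetaA(type):
    """First, unrelated metaclass."""


class MetaB(type):
    """Second, unrelated metaclass."""


class Base(metaclass=MetaA):
    """A base class whose metaclass is MetaA."""


# Declaring a subclass with a metaclass that is not a subclass of MetaA
# triggers the same TypeError seen in the traceback.
try:
    class Broken(Base, metaclass=MetaB):
        pass
except TypeError as exc:
    message = str(exc)
    print(message)


# One general fix: use a metaclass that derives from every metaclass involved.
class MetaAB(MetaA, MetaB):
    """Combined metaclass compatible with both MetaA and MetaB."""


class Child(Base, metaclass=MetaAB):
    pass


print(type(Child).__name__)
```

In a deployment you normally cannot edit the library classes themselves, which is why the practical fix is to change the installed versions (of Python, absl-py, or TensorFlow) so that the library's class definition is valid on the interpreter in use.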