Converting Depth-Anything To CoreML


I'm trying to convert an existing Depth-Anything PyTorch model to CoreML format. I worked in Google Colab and followed a notebook for running inference with the Depth-Anything model. However, I get an exception when I try to import the converted model on the iOS side. Here is my conversion code:

# Installing all needed extensions
!pip install coremltools
# ...

import coremltools as ct
import torch

# Convert the PyTorch model to TorchScript
traced_model = torch.jit.trace(depth_anything, torch.rand(1, 3, 518, 518))

# Convert the TorchScript model to CoreML
model_coreml = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input_1", shape=(1, 3, 518, 518), scale=1/255.0)]
)

output = model_coreml._spec.description.output[0]
output.type.imageType.colorSpace = ct.proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value('RGB')
output.type.imageType.width = 518
output.type.imageType.height = 518

# Save the modified CoreML model
print(model_coreml)
model_coreml.save('/content/drive/MyDrive/trained_models/depth9.mlpackage')

I've tried specifying the input parameters directly, the same way I do for the output:

# Create a dictionary for the input schema
input_schema = {'input_name': 'input', 'input_type': ct.TensorType(shape=(1, 3, 518, 518))}

# Add the input schema to the model's metadata
model_coreml.user_defined_metadata['inputSchema'] = str(input_schema)

I've also tried setting the convert_to option to neuralnetwork:

model_coreml = ct.convert(
    traced_model,
    inputs=[ct.ImageType(name="input_1", shape=(1, 3, 518, 518), scale=1/255.0)],
    convert_to='neuralnetwork'
)

I've also tried replacing ct.proto.FeatureTypes_pb2.ImageFeatureType.ColorSpace.Value('RGB') with the BGR and GRAYSCALE values.
None of this helps.

When I import the model converted with the neuralnetwork backend, loading never finishes. When I import the model converted with the mlprogram backend (the default if convert_to is not specified), I get the following:
Exception while importing mlprogram backend

I'd appreciate any advice, since all I need is to convert the existing Depth-Anything model to CoreML format without any adjustments or changes. Thanks!


Solution

  • Using the neuralnetwork backend, and given that the depth output has shape 1xHxW, the following changes to the shape and scale values did the trick:

    import coremltools as ct
    import torch
    
    x = torch.rand(1, 3, 518, 518)
    traced_model = torch.jit.trace(depth_anything, x, strict=False)
    
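    # A single scalar scale, 1 / (255 * 0.226), approximates the per-channel
    # ImageNet std; the bias is -mean / std for each channel.
    mlmodel = ct.convert(
        traced_model,
        inputs=[ct.ImageType(
            shape=x.shape,
            bias=[-0.485 / 0.229, -0.456 / 0.224, -0.406 / 0.225],
            scale=1.0 / 255.0 / 0.226,
        )],
        convert_to='neuralnetwork',
    )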
    
    mlmodel.save('/content/drive/MyDrive/trained_models/depth_anything.mlmodel')
    

    I'm not sure this is a good solution, but as a workaround it covers all my needs.
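
To see why those scale and bias values work, here is a minimal sketch of the arithmetic in plain Python. It assumes the model expects ImageNet-style normalization, (pixel / 255 - mean) / std, and that the image input applies pixel * scale + bias with one scalar scale for all channels. The function names and test pixels are hypothetical and only serve the illustration.

```python
# ImageNet statistics that the bias/scale values in the answer are built from.
MEAN = (0.485, 0.456, 0.406)
STD = (0.229, 0.224, 0.225)

# Values passed to ct.ImageType in the answer: scalar scale, per-channel bias.
SCALE = 1.0 / 255.0 / 0.226
BIAS = tuple(-m / s for m, s in zip(MEAN, STD))


def exact_normalize(pixel):
    """Per-channel normalization: (value / 255 - mean) / std."""
    return tuple((v / 255.0 - m) / s for v, m, s in zip(pixel, MEAN, STD))


def coreml_style_normalize(pixel):
    """Scalar scale plus per-channel bias: value * scale + bias."""
    return tuple(v * SCALE + b for v, b in zip(pixel, BIAS))


def max_abs_difference(pixel):
    """Largest per-channel gap between the two normalizations for one pixel."""
    exact = exact_normalize(pixel)
    approx = coreml_style_normalize(pixel)
    return max(abs(a - b) for a, b in zip(exact, approx))


# Hypothetical (R, G, B) pixels covering the 8-bit range.
for pixel in [(0, 0, 0), (128, 128, 128), (255, 255, 255)]:
    print(pixel, round(max_abs_difference(pixel), 4))
```

The bias term is exact. Only the shared scale is an approximation, so the gap grows with pixel intensity and stays under 0.06 in normalized units for 8-bit inputs. If that error matters, one option is to move the per-channel division by std into the traced model itself instead of relying on the image preprocessing.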