pytorch, coreml, onnx-coreml

CoreML: creating a custom layer for ONNX RandomNormal


I've trained a VAE in PyTorch that I need to convert to CoreML. Following the thread "PyTorch VAE fails conversion to onnx", I was able to get the ONNX model to export; however, this just pushed the problem one step further, to the ONNX-CoreML stage.

The original function that contains the torch.randn() call is the reparametrize func:

def reparametrize(self, mu, logvar):
    std = logvar.mul(0.5).exp_()
    if self.have_cuda:
        eps = torch.randn(self.bs, self.nz, device='cuda')
    else:
        eps = torch.randn(self.bs, self.nz)
    return eps.mul(std).add_(mu)
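
For reference, the function above computes z = mu + eps * exp(0.5 * logvar), with eps drawn from a standard normal. Below is a minimal, framework-free sketch of that arithmetic using only Python's standard library. The name reparametrize_list and the optional eps argument are illustrative and not part of the original model:

```python
import math
import random

def reparametrize_list(mu, logvar, eps=None):
    """Return z = mu + eps * exp(0.5 * logvar), element-wise over plain lists."""
    if eps is None:
        # Draw one standard-normal sample per latent dimension.
        eps = [random.gauss(0.0, 1.0) for _ in mu]
    # exp(0.5 * logvar) is the standard deviation.
    return [m + e * math.exp(0.5 * lv) for m, lv, e in zip(mu, logvar, eps)]
```

The random draw is the only step that takes no input from the rest of the network, which is why it shows up in the exported graph as an input-less RandomNormal op.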

The solution is, of course, to create a custom layer, but I'm having problems creating a layer with no inputs (i.e., it's just a randn() call).

I can get the CoreML conversion to complete with this def:

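from coremltools.proto import NeuralNetwork_pb2

def convert_randn(node):
    # Describe the ONNX RandomNormal op as a Core ML custom layer.
    params = NeuralNetwork_pb2.CustomLayerParams()
    params.className = "RandomNormal"
    params.description = "Random normal distribution generator"
    params.parameters["dtype"].intValue = node.attrs.get('dtype', 1)
    # Assumes the op carries a 2-D shape attribute: (batch size, latent size).
    params.parameters["bs"].intValue = node.attrs.get("shape")[0]
    params.parameters["nz"].intValue = node.attrs.get("shape")[1]
    return params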

I do the conversion with:

from onnx_coreml import convert

coreml_model = convert(onnx_model, add_custom_layers=True,
    image_input_names=['input'],
    custom_conversion_functions={"RandomNormal": convert_randn})
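
The conversion function above indexes the shape attribute directly, so it would fail with an unhelpful error if that attribute were missing or not 2-D. A hedged sketch of a more defensive extraction follows, operating on a plain dict that stands in for node.attrs. The helper name randn_attrs is hypothetical, and the mean and scale defaults of 0.0 and 1.0 are what I understand the ONNX RandomNormal operator to specify:

```python
def randn_attrs(attrs):
    """Collect RandomNormal attributes from a dict-like set of ONNX node attributes."""
    shape = attrs.get("shape")
    if shape is None or len(shape) != 2:
        raise ValueError("expected a 2-D 'shape' attribute, got %r" % (shape,))
    return {
        "dtype": int(attrs.get("dtype", 1)),     # 1 is FLOAT in ONNX
        "bs": int(shape[0]),                     # batch size
        "nz": int(shape[1]),                     # latent size
        "mean": float(attrs.get("mean", 0.0)),   # assumed ONNX default
        "scale": float(attrs.get("scale", 1.0)), # assumed ONNX default
    }
```

The returned values could then be copied into params.parameters inside convert_randn, so that the custom layer on the Swift side also knows the mean and scale.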

I should also note that, at the completion of the mlmodel export, the following is printed:

Custom layers have been added to the CoreML model corresponding to the 
following ops in the onnx model: 
1/1: op type: RandomNormal, op input names and shapes: [], op output     
names and shapes: [('62', 'Shape not available')]

Bringing the .mlmodel into Xcode produces the complaint: Layer '62' of type 500 has 0 inputs but expects at least 1. So I'm wondering how to specify a kind of "dummy" input to the layer, since it doesn't actually have an input; it's just a wrapper around torch.randn() (or, more specifically, the ONNX RandomNormal op).

I should clarify that I do need the whole VAE, not just the decoder, as I'm actually using the entire process to "error correct" my inputs (i.e., the encoder estimates my z vector based on an input, then the decoder generates the closest generalizable prediction of the input).
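
The Xcode complaint is about a layer whose input list is empty. A small sketch for finding such layers before opening the model in Xcode is shown below. It only relies on each layer having a name and an input attribute; the helper name layers_without_inputs and the stand-in layers "61" and "60" are made up for illustration:

```python
from types import SimpleNamespace

def layers_without_inputs(layers):
    """Return the names of layers whose input list is empty."""
    return [layer.name for layer in layers if len(layer.input) == 0]

# Stand-in objects; with coremltools one would pass spec.neuralNetwork.layers instead.
demo_layers = [
    SimpleNamespace(name="61", input=["60"]),
    SimpleNamespace(name="62", input=[]),
]
print(layers_without_inputs(demo_layers))
```

With a real spec, every name this returns is a layer that would need a dummy input added.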

Any help greatly appreciated.

UPDATE: Okay, I finally got a version to load in Xcode (thanks to @MattijsHollemans and his book!). The originalConversion.mlmodel is the initial output of converting my model from ONNX to CoreML. To this, I had to manually insert the input for the RandomNormal layer. I made it (64, 28, 28) for no great reason — I know my batch size is 64, and my inputs are 28 x 28 (but presumably it could also be (1, 1, 1), since it's a "dummy"):

import coremltools
from coremltools.proto import FeatureTypes_pb2 as ft

spec = coremltools.utils.load_spec('originalConversion.mlmodel')
nn = spec.neuralNetwork
layers = {l.name: i for i, l in enumerate(nn.layers)}
layer_idx = layers["62"]  # '62' is the name of the layer -- see above
layer = nn.layers[layer_idx]
layer.input.extend(["dummy_input"])

# Declare the dummy input as a (64, 28, 28) multi-array of doubles.
inp = spec.description.input.add()
inp.name = "dummy_input"
inp.type.multiArrayType.shape.extend([64, 28, 28])
inp.type.multiArrayType.dataType = ft.ArrayFeatureType.DOUBLE

coremltools.utils.save_spec(spec, "modelWithInsertedInput.mlmodel")

This loads in Xcode, but I have yet to test the functioning of the model in my app. Since the additional layer is simple, and the input is literally a bogus, non-functional input (just to keep Xcode happy), I don't imagine it will be a problem, but I'll post again if it doesn't run properly.

UPDATE 2: Unfortunately, the model doesn't load at runtime. It fails with [espresso] [Espresso::handle_ex_plan] exception=Failed in 2nd reshape after missing custom layer info. What I find very strange and confusing is that, inspecting model.espresso.shape, I see that almost every node has a shape like:

"62" : {
  "k" : 0,
  "w" : 0,
  "n" : 0,
  "seq" : 0,
  "h" : 0
}

I have two questions/concerns: 1) Most obviously, why are all the values zero (this is the case for all but the input nodes), and 2) Why does it appear to be a sequential model, when it's just a fairly conventional VAE? Opening model.espresso.shape for a fully functioning GAN in the same app, I see that the nodes are of the format:

"54" : {
  "k" : 256,
  "w" : 16,
  "n" : 1,
  "h" : 16
}

That is, they contain reasonable shape info, and they don't have seq fields.
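
To check how widespread the zeroed shapes are, a sketch along these lines could be used. It assumes you have already pulled the mapping of node name to shape dict out of model.espresso.shape; the overall layout of that file is not shown here, so that step is left out. The sample data just reuses the two entries quoted above, and zero_shape_nodes is a hypothetical helper name:

```python
import json

def zero_shape_nodes(shapes):
    """Return the sorted names of nodes whose shape entries are all zero."""
    return sorted(
        name for name, dims in shapes.items()
        if dims and all(value == 0 for value in dims.values())
    )

# The two entries quoted above, combined into one mapping for illustration.
sample = json.loads("""
{
  "62": {"k": 0, "w": 0, "n": 0, "seq": 0, "h": 0},
  "54": {"k": 256, "w": 16, "n": 1, "h": 16}
}
""")
print(zero_shape_nodes(sample))
```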

Very, very confused...

UPDATE 3: I've also just noticed in the compiler report the error: IMPORTANT: new sequence length computation failed, falling back to old path. Your compilation was sucessful, but please file a radar on Core ML | Neural Networks and attach the model that generated this message.

Here's the original PyTorch model:

class VAE(nn.Module):
    def __init__(self, bs, nz):
        super(VAE, self).__init__()

        self.nz = nz
        self.bs = bs

        self.encoder = nn.Sequential(
            # input is (nc) x 28 x 28
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # size = (ndf) x 14 x 14
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # size = (ndf*2) x 7 x 7
            nn.Conv2d(ndf * 2, ndf * 4, 3, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # size = (ndf*4) x 4 x 4
            nn.Conv2d(ndf * 4, 1024, 4, 1, 0, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
        )

        self.decoder = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(1024, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # size = (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 3, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # size = (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # size = (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, nc, 4, 2, 1, bias=False),
            nn.Sigmoid()
        )

        self.fc1 = nn.Linear(1024, 512)
        self.fc21 = nn.Linear(512, nz)
        self.fc22 = nn.Linear(512, nz)

        self.fc3 = nn.Linear(nz, 512)
        self.fc4 = nn.Linear(512, 1024)

        self.lrelu = nn.LeakyReLU()
        self.relu = nn.ReLU()

    def encode(self, x):
        conv = self.encoder(x)
        h1 = self.fc1(conv.view(-1, 1024))
        return self.fc21(h1), self.fc22(h1)

    def decode(self, z):
        h3 = self.relu(self.fc3(z))
        deconv_input = self.fc4(h3)
        deconv_input = deconv_input.view(-1, 1024, 1, 1)
        return self.decoder(deconv_input)

    def reparametrize(self, mu, logvar):
        std = logvar.mul(0.5).exp_()
        eps = torch.randn(self.bs, self.nz, device='cuda')  # needs custom layer!
        return eps.mul(std).add_(mu)

    def forward(self, x):
        # print("x", x.size())
        mu, logvar = self.encode(x)
        z = self.reparametrize(mu, logvar)
        decoded = self.decode(z)
        return decoded, mu, logvar

Solution

  • To add an input to your Core ML model, you can do the following from Python:

    import coremltools
    spec = coremltools.utils.load_spec("YourModel.mlmodel")
    
    nn = spec.neuralNetworkClassifier  # or just spec.neuralNetwork
    
    layers = {l.name:i for i,l in enumerate(nn.layers)}
    layer_idx = layers["your_custom_layer"]
    layer = nn.layers[layer_idx]
    layer.input.extend(["dummy_input"])
    
    inp = spec.description.input.add()
    inp.name = "dummy_input"
    inp.type.doubleType.SetInParent()
    
    coremltools.utils.save_spec(spec, "NewModel.mlmodel")
    

    Here, "your_custom_layer" is the name of the layer you want to add the dummy input to. In your model it looks like it's called 62. You can look at the layers dictionary to see the names of all the layers in the model.

    Notes:

    • If your model is not a classifier, use nn = spec.neuralNetwork instead of neuralNetworkClassifier.
    • I made the new dummy input have the type "double". That means your custom layer gets a double value as input.
    • You need to specify a value for this dummy input when using the model.
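
As a rough illustration of the last note, the dummy value can be added to the input dictionary right before prediction. This is only a sketch: the helper with_dummy_input is hypothetical, "input" is the image input name used in the question, the placeholder string stands for whatever image object you pass, and the predict call is commented out because no model is loaded here:

```python
def with_dummy_input(inputs, dummy_name="dummy_input", dummy_value=0.0):
    """Return a copy of the model inputs with the placeholder value added."""
    feed = dict(inputs)  # copy, so the caller's dict is left untouched
    feed[dummy_name] = float(dummy_value)
    return feed

original = {"input": "<your image here>"}
feed = with_dummy_input(original)
# prediction = model.predict(feed)  # model: a loaded coremltools MLModel (not created here)
```

The value itself is irrelevant, since the custom layer can ignore it; it only has to be present and of the declared type.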