
Converted model from keras h5 to pytorch - fully connected layer mismatch


I have converted two models (VGG16 and ResNet50) from Keras with the TensorFlow backend (saved to .h5 files with model.save) into PyTorch using MMdnn. This was done with the following:

mmconvert -sf keras -iw vgg.h5 -df pytorch -om keras_to_torch.pt

import imp
import torch
A = imp.load_source('MainModel','/weights/keras_to_torch.py')
model = torch.load('/weights/keras_to_torch.pt')

Predicting on the same data set gave me a different set of results so I investigated further.

I can see that the weights of all the convolutional layers are the same (after transposing); however, the weights of the fully connected layers at the end are not.

Is there a reason this should be? As far as I understand, they should be equivalent.


Solution

  • The problem must lie in the way you defined your Keras model, since I cannot replicate the issue with the h5 file provided through the MMdnn package. If you want to use the ResNet50 and VGG19 models, you can get the correct weights as follows:

    • start the MMdnn container as specified in the documentation
    • download the Keras model for resnet50

    mmdownload -f keras -n resnet50 -o ./

    • convert to pytorch model
    mmconvert -sf keras -iw ./imagenet_resnet50.h5 -df pytorch -om keras_to_torch.pt
    

    Then copy the produced NumPy weight file, keras_to_torch.pt, and keras_to_torch.py out of the Docker container (along with imagenet_resnet50.h5 for comparison).

    In Python, load the Keras model with

    from keras.models import load_model
    model = load_model('imagenet_resnet50.h5')
    

    and the torch model using

    import imp
    import torch
    torch_weights = 'path_to_the_numpy_weights'  # placeholder: replace with the path to the NumPy weight file
    A = imp.load_source('MainModel','keras_to_torch.py')
    weights_torch = A.load_weights(torch_weights)
    model_torch = A.KitModel(torch_weights)
    

    I also had to set allow_pickle=True in the np.load call inside the load_weights(weight_file) function at the beginning of the keras_to_torch.py file. Unfortunately, the torch.load('/weights/keras_to_torch.pt') variant threw an error for me.
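The allow_pickle change described above can be sketched as follows. This is a minimal illustration, assuming the generated load_weights function reads a pickled Python dict from a .npy file with np.load (which is what the need for allow_pickle suggests); the real body in your keras_to_torch.py may differ. The file name demo_weights.npy and the toy weight dict are made up for the demonstration.

```python
import os
import tempfile

import numpy as np


def load_weights(weight_file):
    """Load a pickled dict of layer weights from a .npy file.

    Assumed shape of the generated helper; not copied from MMdnn.
    """
    if weight_file is None:
        return None
    # Newer NumPy versions default to allow_pickle=False, so loading a
    # pickled dict fails unless the flag is passed explicitly.
    # Only unpickle files you trust.
    return np.load(weight_file, allow_pickle=True).item()


# Demonstration with a made-up weight dict written to a temporary file.
demo_dict = {"fc1000": {"weights": np.ones((4, 3), dtype=np.float32)}}
demo_path = os.path.join(tempfile.mkdtemp(), "demo_weights.npy")
np.save(demo_path, demo_dict, allow_pickle=True)

loaded = load_weights(demo_path)
print(loaded["fc1000"]["weights"].shape)
```

If the flag is left out, np.load raises a ValueError on such a file instead of returning the dict.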

    Print the weights of the last densely connected layer:

    # keras model
    model.layers[-1].weights
    
    # Output:
    #[<tf.Variable 'fc1000/kernel:0' shape=(2048, 1000) dtype=float32, numpy=
    # array([[-0.01490746,  0.0113374 , -0.05073728, ..., -0.02179668,
    #         -0.07764222,  0.01018347],
    #        [-0.00294467,  0.00319835,  0.01953556, ...,  0.03623696,
    #          0.00350259, -0.03321117],
    #        [-0.01751374,  0.00807406,  0.00851311, ..., -0.03024036,
    #          0.05494978, -0.02511911],
    #        ...,
    #        [ 0.025289  ,  0.0630148 ,  0.02041481, ..., -0.00508354,
    #         -0.03542514, -0.01306196],
    #        [-0.00623157, -0.01624131, -0.01221174, ...,  0.01376359,
    #          0.04087579, -0.0185826 ],
    #        [-0.02668471,  0.0130982 , -0.01847764, ...,  0.06304929
    #...
    
    # torch model (make sure to transpose)
    model_torch.fc1000.weight.data.T
    
    # Output:
    #tensor([[-0.0149,  0.0113, -0.0507,  ..., -0.0218, -0.0776,  0.0102],
    #        [-0.0029,  0.0032,  0.0195,  ...,  0.0362,  0.0035, -0.0332],
    #        [-0.0175,  0.0081,  0.0085,  ..., -0.0302,  0.0549, -0.0251],
    #        ...,
    #        [ 0.0253,  0.0630,  0.0204,  ..., -0.0051, -0.0354, -0.0131],
    #        [-0.0062, -0.0162, -0.0122,  ...,  0.0138,  0.0409, -0.0186],
    #        [-0.0267,  0.0131, -0.0185,  ...,  0.0630,  0.0256, -0.0069]])
    

    The weights of the Keras and PyTorch models coincide as desired (to the four or so decimal places shown in the printed output). This solution works as long as you don't need to update the VGG and ResNet weights in Keras before converting them to PyTorch.
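Rather than comparing printed values by eye, the check can also be done numerically. The following is a minimal sketch using only NumPy, with random arrays standing in for the real weights. The helper name dense_weights_match and the tolerance are illustrative choices, not part of MMdnn, Keras, or PyTorch. It relies on the layout convention that a Keras Dense kernel has shape (in_features, out_features) while a PyTorch Linear weight has shape (out_features, in_features).

```python
import numpy as np


def dense_weights_match(keras_kernel, torch_weight, atol=1e-6):
    """Return True if a Keras Dense kernel equals a PyTorch Linear weight.

    keras_kernel: array of shape (in_features, out_features)
    torch_weight: array of shape (out_features, in_features)
    """
    keras_kernel = np.asarray(keras_kernel)
    torch_weight = np.asarray(torch_weight)
    # The PyTorch weight must be transposed before comparing.
    if keras_kernel.shape != torch_weight.T.shape:
        return False
    return bool(np.allclose(keras_kernel, torch_weight.T, atol=atol))


# Stand-in data: a fake (2048, 1000) kernel and its transposed copy.
rng = np.random.default_rng(0)
fake_keras_kernel = rng.normal(size=(2048, 1000)).astype(np.float32)
fake_torch_weight = fake_keras_kernel.T.copy()

print(dense_weights_match(fake_keras_kernel, fake_torch_weight))        # True
print(dense_weights_match(fake_keras_kernel, fake_torch_weight + 0.1))  # False
```

With the real models from above, the two inputs would be model.layers[-1].get_weights()[0] and model_torch.fc1000.weight.data.numpy().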

    If you do need to update the model weights before converting, you should share your code for creating the Keras model. You could also inspect how the imagenet_resnet50.h5 file obtained with mmdownload differs from the one you saved with model.save in Keras, and correct for any differences.
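One way to carry out that inspection is to walk both sets of layer weights and report where they diverge. The sketch below works on plain dicts that map layer names to lists of NumPy arrays; the function name diff_layer_weights and the toy data are hypothetical. For real Keras models, such a dict could be built with {layer.name: layer.get_weights() for layer in model.layers}.

```python
import numpy as np


def diff_layer_weights(reference, candidate, atol=1e-6):
    """Compare two {layer_name: [arrays]} dicts and list the differences."""
    report = []
    for name in sorted(set(reference) | set(candidate)):
        if name not in reference or name not in candidate:
            report.append((name, "missing in one model"))
            continue
        ref_arrays, cand_arrays = reference[name], candidate[name]
        if len(ref_arrays) != len(cand_arrays):
            report.append((name, "different number of arrays"))
            continue
        for idx, (ref, cand) in enumerate(zip(ref_arrays, cand_arrays)):
            if ref.shape != cand.shape:
                report.append((name, f"array {idx}: shape {ref.shape} vs {cand.shape}"))
            elif not np.allclose(ref, cand, atol=atol):
                max_diff = float(np.max(np.abs(ref - cand)))
                report.append((name, f"array {idx}: max abs diff {max_diff:.3g}"))
    return report


# Toy stand-ins for the downloaded model and a re-saved model.
downloaded = {
    "conv1": [np.zeros((3, 3, 3, 8)), np.zeros(8)],
    "fc1000": [np.ones((16, 10)), np.zeros(10)],
}
resaved = {
    "conv1": [np.zeros((3, 3, 3, 8)), np.zeros(8)],
    "fc1000": [np.ones((16, 10)) + 0.5, np.zeros(10)],
}

for layer_name, message in diff_layer_weights(downloaded, resaved):
    print(layer_name, message)
```

An empty report means the two weight sets agree within the tolerance; layers that appear only in one model, or with different shapes, point to a difference in how the model was defined rather than in the conversion.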