Search code examples
pythonmatlabcaffepycaffematcaffe

Results for python and MATLAB caffe are different for the same network


I am trying to port MATLAB implementation of MTCNN_face_detection_alignment to python. I use the same version of caffe bindings for MATLAB and for python.

Minimal runnable code to reproduce the issue:

MATLAB:

addpath('f:/Documents/Visual Studio 2013/Projects/caffe/matlab');
warning off all
caffe.reset_all();
%caffe.set_mode_cpu();
caffe.set_mode_gpu();
caffe.set_device(0);
prototxt_dir = './model/det1.prototxt';
model_dir = './model/det1.caffemodel';
PNet=caffe.Net(prototxt_dir,model_dir,'test');
img=imread('F:/ImagesForTest/test1.jpg');
[hs ws c]=size(img)
im_data=(single(img)-127.5)*0.0078125;
PNet.blobs('data').reshape([hs ws 3 1]);
out=PNet.forward({im_data});
imshow(out{2}(:,:,2))

Python:

import numpy as np
import caffe
import cv2

caffe.set_mode_gpu()
caffe.set_device(0)

model = './model/det1.prototxt'
weights = './model/det1.caffemodel'

PNet = caffe.Net(model, weights, caffe.TEST) # create net and load weights

print ("\n\n----------------------------------------")
print ("------------- Network loaded -----------")
print ("----------------------------------------\n")

img = np.float32(cv2.imread( 'F:/ImagesForTest/test1.jpg' ))
img=cv2.cvtColor(img,cv2.COLOR_BGR2RGB)

avg = np.array([127.5,127.5,127.5])
img = img - avg
img = img*0.0078125;
img = img.transpose((2,0,1)) 
img = img[None,:] # add singleton dimension

PNet.blobs['data'].reshape(1,3,img.shape[2],img.shape[3])
out = PNet.forward_all( data = img )

cv2.imshow('out',out['prob1'][0][1])
cv2.waitKey()

The model I use located here (det1.prototxt and det1.caffemodel)

The image I used to get these results:

enter image description here

The results I have from both cases:

enter image description here

Results are similar, but not same.

UPD: it looks like not type conversion problem (corrected, but nothing changed). I saved result of convolution after conv1 layer (first channel) in matlab, and extracted the same data in python, both images are now displayed by python cv2.imshow.

Data on input layer (data) are absolutely the same, I did the check using the same method.

And as you can see the difference visible even on the first (conv1) layer. Looks like kernels transformed somehow.

enter image description here

Can anyone say where is the difference hidden ?


Solution

  • I found the issue source, it is because MATLAB sees image transposed. This python code shows the same result as MATLAB one:

    import numpy as np
    import caffe
    import cv2
    
    caffe.set_mode_gpu()
    caffe.set_device(0)
    
    model = './model/det1.prototxt'
    weights = './model/det1.caffemodel'
    
    PNet = caffe.Net(model, weights, caffe.TEST) # create net and load weights
    
    print ("\n\n----------------------------------------")
    print ("------------- Network loaded -----------")
    print ("----------------------------------------\n")
    
    img = np.float32(cv2.imread( 'F:/ImagesForTest/test1.jpg' ))
    img=cv2.cvtColor(img,cv2.COLOR_BGR2RGB)
    img=cv2.transpose(img) # <----- THIS line !
    avg = np.float32(np.array([127.5,127.5,127.5]))
    img = img - avg
    img = np.float32(img*0.0078125);
    img = img.transpose((2,0,1)) 
    img = img[None,:] # add singleton dimension
    
    PNet.blobs['data'].reshape(1,3,img.shape[2],img.shape[3])
    out = PNet.forward_all( data = img )
    
    # transpose it back and show the result  
    cv2.imshow('out',cv2.transpose(out['prob1'][0][1]))
    cv2.waitKey() 
    

    Thanks all who advised me in comments !