Tags: keras, deep-learning, feature-extraction, training-data, pre-trained-model

How to extract similar features of the same person from different images?


The aim of my project is to extract specific facial features on a mobile phone. It is a verification application that uses the user's face. Given two different images of the same person, the extracted features should be as close to each other as possible.

Right now, I use the pretrained model and weights from the VGGFace team as a feature extractor; you can download the model here. However, the features I extracted with this model were not good enough. Below I describe what I did and what I want:

I extract features from images of Emma Watson: image_1 returns feature_1, image_2 returns feature_2, and so on (vector length = 2048). If feature[i] > 0.0, I convert it to 1:

```python
for i in range(0, 2048):
    if feature1[0][i] > 0.0:
        feature1[0][i] = 1
```
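For reference, here is a minimal sketch of the extraction-and-thresholding step. The file name `vggface_model.h5`, the layer name `pool5`, and the plain 1/255 scaling are placeholders for whatever pretrained VGGFace model, layer, and preprocessing you actually use:

```python
import numpy as np
from keras.models import Model, load_model
from keras.preprocessing import image

# Hypothetical model file and layer name; substitute your own.
base = load_model('vggface_model.h5')
extractor = Model(inputs=base.input,
                  outputs=base.get_layer('pool5').output)

def extract_binary_feature(img_path, target_size=(224, 224)):
    """Load an image, run it through the extractor, and binarize the output."""
    img = image.load_img(img_path, target_size=target_size)
    x = image.img_to_array(img)[np.newaxis] / 255.0   # simple scaling; use the
                                                       # model's own preprocessing if known
    f = extractor.predict(x).reshape(-1)               # e.g. a 2048-d vector
    return (f > 0.0).astype(np.uint8)                  # same thresholding as above
```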

Then I compare the two feature vectors using the Hamming distance. The Hamming distance is just a naive way to compare; in the real project I will quantize the features before comparing. However, the distance between two images of Emma is still large, even when I use two neutral-expression images (same emotion; different emotions give even worse results).
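For concreteness, a small sketch of this comparison, assuming the hypothetical `extract_binary_feature` helper from the sketch above and made-up file names:

```python
import numpy as np

def hamming_distance(a, b):
    """Number of positions at which two binary feature vectors differ."""
    return int(np.count_nonzero(a != b))

# Hypothetical file names; normalizing by the vector length makes thresholding easier.
f1 = extract_binary_feature('emma_1.jpg')
f2 = extract_binary_feature('emma_2.jpg')
print(hamming_distance(f1, f2) / f1.size)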

My question is: how can I train the model to extract the features of a target user? Imagine Emma is the target user, and her phone only needs to extract her features. When someone tries to unlock Emma's phone, the phone extracts that person's features and compares them with Emma's saved features. In addition, I don't want to train a model that classifies two classes, Emma and not-Emma. What I need is to compare the extracted features.
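As an illustration of that unlock flow, a hedged sketch built on the hypothetical helpers above; the stored template file name and the threshold value are assumptions and would have to be tuned on real data:

```python
# Hypothetical enrollment/verification flow.
ENROLLED = extract_binary_feature('emma_enrolled.jpg')  # stored at enrollment time
THRESHOLD = 0.25                                        # assumed fraction of differing bits

def verify(img_path):
    """Accept the probe image if its binary features are close enough to the template."""
    probe = extract_binary_feature(img_path)
    dist = hamming_distance(ENROLLED, probe) / probe.size
    return dist < THRESHOLD   # True -> accept as the enrolled user
```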

To sum up: if we compare features from different images of the same person, the distance (difference) should be small ("close"). If we compare features from images of different people, the distance should be large ("far").

Thank you so much.


Solution

  • I'd do the following. We want to compute the features from a deep layer of a ConvNet so that we can ultimately compare new images with a base image. Say this deep layer gives you the feature vector f. Now create a dataset of image pairs with a label y: y = 1 if both images are of the same person, and y = 0 if they are not. Then compute the element-wise absolute difference of the two feature vectors and feed it into a logistic regression unit to get your prediction: y_hat = sigmoid(np.dot(W, np.abs(f1 - f2)) + b). To produce f1 and f2 you build a "Siamese" network: two copies of the same ConvNet, one giving you f1 for the first image of a pair and the other giving you f2 for the second image. Siamese networks must have exactly the same weights at all times, so you need to ensure the two branches stay identical (i.e. share their weights). As you train this new network, you should get the desired results; see the sketch below.
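A minimal Keras sketch of such a Siamese verification network, assuming `base_cnn` is the shared feature extractor you already have (its name, the input shape, and the commented training call are placeholders):

```python
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import keras.backend as K

def build_siamese(base_cnn, input_shape=(224, 224, 3)):
    img_a = Input(shape=input_shape)
    img_b = Input(shape=input_shape)

    # Reusing the same `base_cnn` layer object on both inputs shares its weights,
    # which keeps the two branches identical at all times.
    f_a = base_cnn(img_a)
    f_b = base_cnn(img_b)

    # Element-wise absolute difference |f1 - f2|
    diff = Lambda(lambda t: K.abs(t[0] - t[1]))([f_a, f_b])

    # Logistic regression unit: y_hat = sigmoid(W . |f1 - f2| + b)
    y_hat = Dense(1, activation='sigmoid')(diff)

    model = Model(inputs=[img_a, img_b], outputs=y_hat)
    model.compile(optimizer='adam', loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Training data: pairs of images and labels y (1 = same person, 0 = different), e.g.:
# siamese = build_siamese(base_cnn)
# siamese.fit([images_a, images_b], y, batch_size=32, epochs=10)
```

At verification time you would feed the stored image (or its features) and the new image through the two branches and threshold y_hat, instead of hand-thresholding the raw features.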