We are using FaceNet (https://github.com/davidsandberg/facenet) and have generated 128-dimensional embeddings for faces. We have 100k classes (celebrities) from MS-Celeb-1M (http://www.msceleb.org/) and 8M samples.
How does one construct a neural network that can map the 128 features to 100k classes?
A single fully connected layer would have (128 + 1) * 100k = 12.9 million parameters (128 weights plus one bias per class), which seems too large to train.
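For reference, here is a minimal sketch (assuming TensorFlow/Keras; the model itself is just an illustration, not something from the FaceNet repo) that confirms the parameter count above:

```python
import tensorflow as tf

# One dense softmax layer mapping a 128-d FaceNet embedding to 100k classes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(128,)),                           # FaceNet embedding
    tf.keras.layers.Dense(100_000, activation="softmax"),   # one output per celebrity
])
model.summary()  # Dense: 128 * 100_000 weights + 100_000 biases = 12,900,000 params
```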
From the FaceNet abstract:
In this paper we present a system, called FaceNet, that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity. Once this space has been produced, tasks such as face recognition, verification and clustering can be easily implemented using standard techniques with FaceNet embeddings as feature vectors.
Instead of training a classifier, consider doing a nearest-neighbor search in the embedding space. You can select anchor images for each of your 100k celebrities and build a k-d tree from their feature vectors. Then, for each input, you find its nearest neighbor in the k-d tree, as sketched below.
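A minimal sketch of that idea, assuming scikit-learn and precomputed FaceNet embeddings; `anchor_embeddings` and `query_embedding` are hypothetical placeholders for your own data (random vectors here just to make the snippet runnable):

```python
import numpy as np
from sklearn.neighbors import KDTree

n_classes, dim = 100_000, 128
anchor_embeddings = np.random.rand(n_classes, dim)  # one FaceNet vector per celebrity
celebrity_ids = np.arange(n_classes)                # label for each anchor row

tree = KDTree(anchor_embeddings)                    # built once, reused for every query

query_embedding = np.random.rand(1, dim)            # embedding of the face to identify
dist, idx = tree.query(query_embedding, k=1)        # nearest anchor in Euclidean space
print(celebrity_ids[idx[0, 0]], dist[0, 0])         # predicted identity and its distance
```

One caveat: exact k-d trees lose their advantage in high dimensions, so for 128-dimensional vectors an approximate nearest-neighbor library (e.g., FAISS or Annoy) may scale better to 100k anchors.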