machine-learning classification data-science supervised-learning multiclass-classification

How to classify an imbalanced dataset when given 0 samples of a particular class?


I have been given a training set and a test set. I will use the training set to compare various models and feature selections. I know its output labels, which fall into 10 different categories, but I have been told that one of those classes has 0 samples in the training set.

How do I deal with this?

I know I can use oversampling/undersampling with imbalanced sets, but will that help if one of the classes has 0 occurrences?


Solution

  • Your use case falls into the domain of zero-shot learning, originally introduced as zero-data learning. Oversampling and undersampling cannot help here, because there are no samples of the missing class to resample. Zero-shot learning instead relies on building separable representations of the underlying classes in a way that generalizes beyond the given samples. It's not an easy problem to solve, but depending on your data and problem space it might be doable. Some resources to get you started:

    1. Zero-Data Learning
    2. Deep Learning Book: Representation Learning
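
To make the idea concrete, here is a minimal sketch of one common zero-shot recipe: describe every class, including the one without samples, by an attribute vector, learn a map from features to that attribute space using only the classes you do have, and then assign each test point to the nearest class description. Everything in it is an assumption for illustration: the attribute vectors in `class_attrs`, the toy `mixing` matrix that generates the data, and the choice of ridge regression are hypothetical and are not taken from the resources above. In a real problem you would need some side information describing the missing class, and the approach only works if that description actually relates to the features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical class descriptions (one attribute vector per class).
# Class 2 is the class with zero training samples.
class_attrs = np.array([
    [1.0, 0.0],   # class 0 (seen)
    [0.0, 1.0],   # class 1 (seen)
    [1.0, 1.0],   # class 2 (unseen during training)
])

# Toy assumption: features are a noisy linear function of the attributes.
mixing = np.array([
    [2.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 1.0, 0.0],
])

def sample(labels):
    """Draw synthetic 5-dimensional feature vectors for the given labels."""
    clean = class_attrs[labels] @ mixing
    return clean + 0.1 * rng.standard_normal(clean.shape)

# The training set contains only classes 0 and 1.
y_train = rng.integers(0, 2, size=400)
X_train = sample(y_train)

# Ridge regression (no intercept) from feature space to attribute space.
lam = 1e-3
A_train = class_attrs[y_train]
W = np.linalg.solve(
    X_train.T @ X_train + lam * np.eye(X_train.shape[1]),
    X_train.T @ A_train,
)

def predict(X):
    """Project features into attribute space, then pick the nearest class."""
    pred_attrs = X @ W
    dists = np.linalg.norm(
        pred_attrs[:, None, :] - class_attrs[None, :, :], axis=2
    )
    return dists.argmin(axis=1)

# The test set contains all three classes, including the unseen one.
y_test = np.repeat([0, 1, 2], 100)
X_test = sample(y_test)
y_pred = predict(X_test)

unseen_acc = float((y_pred[y_test == 2] == 2).mean())
overall_acc = float((y_pred == y_test).mean())
print(f"accuracy on unseen class: {unseen_acc:.2f}")
print(f"overall accuracy: {overall_acc:.2f}")
```

The unseen class is recognizable here only because the toy data was built so that its description is a combination of attributes the model has already observed. With real data, the quality of the class descriptions and of the learned representation decides whether this transfers.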