tensorflow keras logistic-regression multiclass-classification

TensorFlow Keras multiclass classification: what datasets to prepare?


For example, suppose I want to train a model to classify "dog", "cat" and "neither dog nor cat". Do I need to prepare a dataset for "neither dog nor cat"? Is there any way to accomplish this with only "dog" and "cat" datasets?


Solution

  • Yes, it is recommended that the labelled data include an "other" class and that the output layer have an additional neuron to predict it.

    Let's start with a binary classifier for "dog" or "cat":

    1. A softmax activation is typically used in the output layer.
    2. It normalizes the outputs into probabilities over the two classes that sum to 1.
    3. This makes it easy to pick the more likely class.
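
To illustrate why a two-class softmax cannot say "neither", here is a minimal pure-Python sketch. The raw scores are hypothetical values chosen for illustration; they are not the output of any trained model.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

CLASS_NAMES = ["dog", "cat"]

# Hypothetical raw scores for an input that is neither a dog nor a cat:
# both scores are low, but softmax only looks at their difference.
logits = [-4.0, -5.0]
probs = softmax(logits)

for name, p in zip(CLASS_NAMES, probs):
    print(f"{name}: {p:.3f}")
```

Even though both raw scores are low, "dog" still receives roughly 0.73 of the probability mass, because the two outputs are forced to sum to 1.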

    Now let us add a third neuron for "other". The model needs some labelled "other" data to learn when to activate it.
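
A three-output model of this kind might look like the following sketch. The feature size, hidden layer width, and random placeholder data are assumptions made only so the example runs; a real model would use your own inputs and labelled "dog", "cat" and "other" examples.

```python
import numpy as np
from tensorflow import keras

NUM_FEATURES = 64                      # hypothetical input size
CLASS_NAMES = ["dog", "cat", "other"]  # label ids 0, 1, 2

def build_model(num_features=NUM_FEATURES, num_classes=len(CLASS_NAMES)):
    model = keras.Sequential([
        keras.Input(shape=(num_features,)),
        keras.layers.Dense(32, activation="relu"),  # hypothetical hidden layer
        # One output neuron per class, including "other".
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer labels
        metrics=["accuracy"],
    )
    return model

model = build_model()

# Placeholder data only: random features and random labels in {0, 1, 2}.
rng = np.random.default_rng(0)
x = rng.normal(size=(30, NUM_FEATURES)).astype("float32")
y = rng.integers(0, len(CLASS_NAMES), size=30)

model.fit(x, y, epochs=1, verbose=0)
probs = model.predict(x[:2], verbose=0)  # shape (2, 3), each row sums to 1
predicted = [CLASS_NAMES[i] for i in probs.argmax(axis=1)]
```

With this layout, "other" is just another label in the training set, so no extra decision logic is needed after `predict`.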

    Alternatively,

    1. Use a sigmoid activation with two output neurons.
    2. Adjust the prediction thresholds so that if the dog and cat scores are both below their thresholds, the result is "neither".
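
The thresholding step could be sketched as below. The function name `classify`, the default thresholds of 0.5, and the example scores are all hypothetical; the scores stand in for the two sigmoid outputs of a model ending in something like `Dense(2, activation="sigmoid")`.

```python
def classify(scores, thresholds=None):
    """Map per-class sigmoid scores to a label, or "neither".

    scores:     dict of class name -> score in [0, 1]
    thresholds: dict of class name -> minimum score (assumed 0.5 if omitted)
    """
    if thresholds is None:
        thresholds = {name: 0.5 for name in scores}

    # Keep only the classes whose score reaches their own threshold.
    passed = {name: s for name, s in scores.items() if s >= thresholds[name]}

    if not passed:
        return "neither"  # every class fell below its threshold
    return max(passed, key=passed.get)  # highest-scoring class that passed

# Hypothetical sigmoid outputs for three inputs.
print(classify({"dog": 0.92, "cat": 0.10}))
print(classify({"dog": 0.15, "cat": 0.81}))
print(classify({"dog": 0.20, "cat": 0.30}))
```

Note that the "neither" decision lives entirely in this post-processing function, not in the model.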

    Though this alternative approach works, it may not be recommended, because the additional class is inferred by custom logic outside the model and is not known to the model itself.

    In the future, if someone adds another class, say "horse" (along with dog and cat), that code needs to be modified. This seems like unnecessary complexity in the long run.