Tags: machine-learning, deep-learning, artificial-intelligence, fine-tuning, few-shot-learning

What are the differences between fine-tuning and few-shot learning?


I am trying to understand the concept of fine-tuning and few-shot learning.

I understand the need for fine-tuning: it essentially adapts a pre-trained model to a specific downstream task. Recently, however, I have seen a plethora of blog posts discussing zero-shot learning, one-shot learning and few-shot learning.

  • How are they different from fine-tuning? It appears to me that few-shot learning is a specialization of fine-tuning. What am I missing here?

Can anyone please help me?


Solution

  • Fine-tuning - When you already have a model trained to perform the task you want, but on a different dataset, you initialise it with the pre-trained weights and continue training on the target dataset (usually smaller), typically with a smaller learning rate.

    Few-shot learning - When you want to train a model on a task using very few samples. For example, you have a model trained on a different but related task, you (optionally) modify it, and you train it for the target task using a small number of examples.

    For example:

    Fine-tuning - Training a model for intent classification and then fine-tuning it on a different intent classification dataset.

    Few-shot learning - Training a language model on a large text dataset, then modifying it (usually the last layer or last few layers) to classify intents by training on a small labelled dataset.

    There are many other ways to do few-shot learning. As one more example, consider training a model to classify images where some classes have very few training samples (zero for zero-shot, one for one-shot). At inference time, correctly classifying these classes, which were rare in training, is the aim of few-shot learning.
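
To make the contrast above concrete, here is a minimal pure-Python sketch, not a definitive recipe. It assumes a toy two-layer model (a "backbone" and a "head") standing in for a pre-trained network. The weights, the four labelled examples, and the names `forward` and `train` are all hypothetical and chosen only for illustration. Fine-tuning updates every weight with a small learning rate, while the few-shot variant described above freezes the backbone and trains only the last layer on the handful of examples.

```python
import copy
import math


def forward(model, x):
    """Return (hidden features, predicted probability) for one input vector."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)))
              for row in model["backbone"]]
    logit = sum(w * h for w, h in zip(model["head"], hidden)) + model["bias"]
    return hidden, 1.0 / (1.0 + math.exp(-logit))


def train(model, data, lr, epochs, train_backbone):
    """Plain SGD on binary cross-entropy; returns an updated copy of the model."""
    model = copy.deepcopy(model)
    for _ in range(epochs):
        for x, y in data:
            hidden, p = forward(model, x)
            err = p - y  # gradient of the loss with respect to the logit
            if train_backbone:
                # Backpropagate through the head weight and the tanh.
                for i, row in enumerate(model["backbone"]):
                    g = err * model["head"][i] * (1.0 - hidden[i] ** 2)
                    for j in range(len(row)):
                        row[j] -= lr * g * x[j]
            # The head (last layer) is always trained.
            for i in range(len(hidden)):
                model["head"][i] -= lr * err * hidden[i]
            model["bias"] -= lr * err
    return model


# Hypothetical "pre-trained" weights and a tiny labelled target dataset.
pretrained = {"backbone": [[0.5, -0.3], [0.1, 0.8]],
              "head": [0.2, -0.1],
              "bias": 0.0}
small_dataset = [([1.0, 0.0], 1), ([0.0, 1.0], 0),
                 ([0.9, 0.2], 1), ([0.1, 0.8], 0)]

# Fine-tuning: start from pre-trained weights, update everything, small learning rate.
fine_tuned = train(pretrained, small_dataset, lr=0.01, epochs=50, train_backbone=True)

# Few-shot (one common approach): freeze the backbone, train only the last layer.
few_shot = train(pretrained, small_dataset, lr=0.5, epochs=50, train_backbone=False)

print("backbone changed by fine-tuning:", fine_tuned["backbone"] != pretrained["backbone"])
print("backbone changed by few-shot:   ", few_shot["backbone"] != pretrained["backbone"])
```

In a real framework the same split is usually expressed by marking the backbone parameters as non-trainable before building the optimizer. The learning rates here are arbitrary illustrative values.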