machine-learning, artificial-intelligence, transfer-learning

Differences between Transfer Learning and Meta Learning


What are the differences between meta learning and transfer learning?

I have read two articles, one on Quora and one on Towards Data Science.

Meta learning is a part of machine learning theory in which algorithms are applied to metadata about a learning problem in order to improve the learning process. The metadata includes properties of the algorithms used, the learning task itself, etc. Using this metadata, one can make a better decision about which learning algorithm(s) to choose, so as to solve the problem more efficiently.

and

Transfer learning aims at improving the process of learning new tasks using the experience gained by solving predecessor problems that are somewhat similar. In practice, machine learning models are most often designed to accomplish a single task. However, as humans, we make use of our past experience not only to repeat the same task in the future but also to learn completely new tasks. That is, if a new problem we try to solve is similar to a few of our past experiences, it becomes easier for us. With the aim of applying the same learning approach in machine learning, transfer learning comprises methods that transfer past experience from one or more source tasks and use it to boost learning in a related target task.

The comparison still confuses me, as both seem to share a lot of similarities in terms of reusability. Meta learning is said to be "model agnostic", yet it uses metadata (hyperparameters or weights) from previously learned tasks. The same goes for transfer learning, as it may partially reuse a trained network to solve related tasks. I understand there is a lot more to discuss, but broadly speaking I do not see much difference between the two.

People also use terms like meta-transfer learning, which makes me think the two types of learning are strongly connected.


Solution

  • In transfer learning, we pre-train model parameters on a large dataset and then use those parameters as the initialization for fine-tuning on some other task that has a smaller dataset. This classic pre-training approach offers no guarantee that the learned initialization is actually good for fine-tuning. In meta-learning, we instead learn an initial set of parameters that can be fine-tuned easily on another similar task with only a few gradient steps: it directly optimizes post-fine-tuning performance with respect to this initialization by differentiating through the fine-tuning process.
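To make the contrast concrete, here is a minimal sketch in PyTorch (2.x, for torch.func.functional_call) on a hypothetical toy regression family; the names sample_task and make_model, the learning rates, and the step counts are illustrative assumptions, not from the answer. The first half is the classic pre-train/fine-tune recipe; the second half follows the MAML-style idea the answer describes, using create_graph=True so the outer loss can differentiate through the inner fine-tuning step.

```python
import torch
import torch.nn as nn

def sample_task():
    # Hypothetical task family: regress y = a*x + b with task-specific a, b.
    a, b = torch.randn(1), torch.randn(1)
    def batch(n=16):
        x = torch.randn(n, 1)
        return x, a * x + b
    return batch

def make_model():
    return nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))

loss_fn = nn.MSELoss()

# --- Transfer learning: pre-train on a large source task, then fine-tune. ---
model = make_model()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
source = sample_task()
for _ in range(1000):                      # pre-training on the big dataset
    x, y = source(n=256)
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()
target = sample_task()                     # new, related task with little data
x, y = target(n=16)
for _ in range(10):                        # fine-tune from the pre-trained init
    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

# --- Meta-learning (MAML-style): optimize the initialization itself. ---
meta_model = make_model()
meta_opt = torch.optim.SGD(meta_model.parameters(), lr=1e-3)
inner_lr = 1e-2
for _ in range(1000):                      # meta-training over many tasks
    task = sample_task()
    x_tr, y_tr = task()                    # support set: used to fine-tune
    x_te, y_te = task()                    # query set: measures the result
    params = dict(meta_model.named_parameters())
    # One inner fine-tuning step; create_graph=True keeps this step
    # differentiable so the outer loss can backpropagate *through* it.
    inner_loss = loss_fn(torch.func.functional_call(meta_model, params, (x_tr,)), y_tr)
    grads = torch.autograd.grad(inner_loss, tuple(params.values()), create_graph=True)
    adapted = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}
    # Outer loss: performance *after* adaptation, as a function of the init.
    outer_loss = loss_fn(torch.func.functional_call(meta_model, adapted, (x_te,)), y_te)
    meta_opt.zero_grad()
    outer_loss.backward()
    meta_opt.step()
```

Note the key difference: in the transfer-learning half, the pre-training loss never sees fine-tuning at all, whereas in the meta-learning half the outer loss is computed after adaptation, so the initialization is trained explicitly to fine-tune well.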