Tags: machine-learning, deep-learning, artificial-intelligence, fine-tuning, few-shot-learning

What are the differences between adapter tuning and prefix tuning?


I am trying to understand the concept of adapter-tuning, prompt-tuning, and prefix-tuning in the context of few-shot learning.

It appears to me that I can apply prompt tuning to a black box language model.

I read that for prompt tuning the entire pre-trained language model is frozen. If that's the case, prompt tuning could be applied to an OpenAI model like GPT-3 or Codex.

How could I do prompt tuning with OpenAI Codex? I haven't found any way so far.

How are these techniques different from the in-context examples that can be given in few-shot learning?

Can anyone please guide me in the correct direction?


Solution

  • These are alternatives to fine-tuning a model. They are essentially solutions that sit between few-shot learning and full fine-tuning of a model.

    The other answer in this SO post is completely wrong. Fine-tuning has nothing to do with either prompt tuning or prefix tuning; these two are completely different techniques from fine-tuning.

    Correct descriptions of prompt tuning and prefix tuning are given below:

    • Prompt Tuning: k learnable parameters, i.e. continuous token embeddings, are prepended to the input, while the entire pre-trained language model stays frozen.

    • Prefix Tuning: For k positions prepended to the input, additional learnable weights for keys and values are concatenated at every attention layer. This differs from prompt tuning, which only learns input vectors.

    Papers that introduced these techniques are given below: