In the documentation for the GPT-3 API, it says:
One limitation to keep in mind is that, for most models, a single API request can only process up to 2,048 tokens (roughly 1,500 words) between your prompt and completion.
In the documentation for fine-tuning, it says:
The more training examples you have, the better. We recommend having at least a couple hundred examples. In general, we've found that each doubling of the dataset size leads to a linear increase in model quality.
My questions are: does the 1,500-word limit also apply to a fine-tuned model? And does "doubling of the dataset size" refer to the number of training examples, rather than the size of each individual example?
GPT-3 models have token limits because a single API request processes one prompt and returns one completion, and both must fit within the model's context window. As stated in the official OpenAI documentation:
Depending on the model used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most.
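If you want to see how much room a given prompt leaves for the completion, here is a minimal sketch using tiktoken's `r50k_base` encoding (the tokenizer used by the original GPT-3 base models); the prompt text and the 4,097-token limit are just illustrative:

```python
# A minimal sketch: count a prompt's tokens with tiktoken and see how
# many tokens remain for the completion. The 4,097-token limit is the
# shared prompt + completion limit quoted above.
import tiktoken

MODEL_TOKEN_LIMIT = 4097  # shared between prompt and completion

enc = tiktoken.get_encoding("r50k_base")

prompt = "Translate the following English text to French: Hello, world!"  # illustrative prompt
prompt_tokens = len(enc.encode(prompt))

# Whatever the prompt doesn't use is the most the completion can have.
max_completion_tokens = MODEL_TOKEN_LIMIT - prompt_tokens
print(f"Prompt: {prompt_tokens} tokens; "
      f"completion can be at most {max_completion_tokens} tokens.")
```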
Fine-tuning, on the other hand, has no token limit on the dataset as a whole (i.e., you can have a million training examples, meaning a million prompt-completion pairs), as stated in the official OpenAI documentation:
The more training examples you have, the better. We recommend having at least a couple hundred examples. In general, we've found that each doubling of the dataset size leads to a linear increase in model quality.
So "doubling of the dataset size" refers to the number of prompt-completion pairs, not to the length of each pair. However, each individual prompt-completion pair does have a token limit: no single pair in your training file may exceed the token limit of the model you are fine-tuning.
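To catch oversized pairs before uploading, you can validate your JSONL training file. A minimal sketch (the file name `training_data.jsonl` and the 2,048-token per-pair limit are assumptions; use the limit documented for your model):

```python
# A minimal sketch: check that no prompt-completion pair in a
# fine-tuning JSONL file exceeds the per-example token limit.
import json
import tiktoken

PER_PAIR_TOKEN_LIMIT = 2048  # assumed per-example limit
enc = tiktoken.get_encoding("r50k_base")

with open("training_data.jsonl") as f:  # one {"prompt": ..., "completion": ...} per line
    for line_no, line in enumerate(f, start=1):
        pair = json.loads(line)
        total = len(enc.encode(pair["prompt"])) + len(enc.encode(pair["completion"]))
        if total > PER_PAIR_TOKEN_LIMIT:
            print(f"Line {line_no}: {total} tokens exceeds the {PER_PAIR_TOKEN_LIMIT}-token limit")
```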