The OpenAI documentation on factual responses says it's possible to give a model a ground truth, but I don't understand how to do so.
I would like to provide a large body of documentation, for instance a thousand Wikipedia articles, and then provide example questions and answers. These Q&A pairs would teach my model the expected answer format, and to say that it doesn't know when the provided articles contain no answer.
In the API documentation I only see a way to provide a Q&A dataset, not a way to provide ground-truth data.
The OpenAI documentation itself recommends using embeddings instead of fine-tuning for knowledge.
Fine-tuning does not add knowledge to the model. You can find the message (from OpenAI) below in their cookbook.
Note: To answer questions based on text documents, we recommend the procedure in Question Answering using Embeddings.
https://community.openai.com/t/fine-tuning-myths-openai-documentation/133608/1
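To make the embeddings approach concrete, here is a minimal sketch of the retrieval step: embed each article, embed the question, keep the closest articles, and fall back to "I don't know" when nothing is close enough. Everything in it is an assumption for illustration. In particular, `embed` is a toy word-count vector standing in for a real embedding model, and `retrieve`, `build_prompt`, `top_k` and `min_score` are hypothetical names, not part of any OpenAI API.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a real embedding model (assumption): word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, articles, top_k=2, min_score=0.2):
    """Return up to top_k articles whose similarity to the question
    is at least min_score, most similar first."""
    q = embed(question)
    scored = [(cosine(q, embed(a)), a) for a in articles]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [a for score, a in scored[:top_k] if score >= min_score]


def build_prompt(question, articles):
    """Build a prompt grounded in the retrieved articles, or return None
    when no article is relevant (the caller then answers "I don't know")."""
    context = retrieve(question, articles)
    if not context:
        return None
    joined = "\n".join(context)
    return (
        "Answer using only the context below. "
        "If the answer is not in the context, say \"I don't know\".\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"
    )


articles = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Mount Everest is the highest mountain on Earth.",
]

prompt = build_prompt("When was the Eiffel Tower completed?", articles)
no_prompt = build_prompt("Who won the 2018 World Cup?", articles)
```

Here `prompt` contains only the Eiffel Tower article as context, and `no_prompt` is `None` because neither article scores above the threshold. In a real setup you would replace `embed` with calls to an actual embedding model, store the article vectors once instead of recomputing them per question, and send the resulting prompt to a completion model. The Q&A examples from the question can still be used to show the expected answer format, while the articles themselves stay in the retrieval index as the ground truth.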