Tags: machine-learning, training-data, gpt-2, gpt-3

Fine-tuning GPT-2/3 on new data


I'm trying to wrap my head around training OpenAI's language models on new data sets. Does anyone here have experience with that? My idea is to feed either GPT-2 or GPT-3 (though I don't have API access to GPT-3) a textbook, train it on that text, and then be able to "discuss" the content of the book with the language model afterwards. I don't think I'd have to change any of the hyperparameters; I just need to get more data into the model.

Is that possible?

Thanks a lot for any (also conceptual) help!


Solution

  • Presently GPT-3 cannot be fine-tuned the way GPT-2 or GPT-Neo / NeoX can, because the model is kept on OpenAI's servers and all requests have to go through the API. A Hacker News post suggests that fine-tuning support for GPT-3 is planned or under development. (A minimal GPT-2 fine-tuning sketch is given after this answer.)

    Having said that, OpenAI's GPT-3 does provide an Answers API to which you can supply context documents (up to 200 files / 1 GB). That API could then be used as a way to "discuss" the documents with the model (see the second sketch below).

    EDIT: OpenAI has since introduced a fine-tuning beta: https://beta.openai.com/docs/guides/fine-tuning. Following the guide at that link is now the best answer to this question (see the third sketch below).
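
For the GPT-2 route, here is a minimal sketch of fine-tuning on a plain-text book with the Hugging Face `transformers` library. The file name `book.txt`, the output directory, and all hyperparameters are placeholders to adapt, not recommended values, and `TextDataset` is the simple (now deprecated) helper from the library versions of that era.

```python
# Minimal GPT-2 fine-tuning sketch with Hugging Face transformers.
# "book.txt" and all hyperparameters are placeholders.
from transformers import (GPT2LMHeadModel, GPT2TokenizerFast, TextDataset,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Chunk the book into fixed-length blocks of token IDs for causal LM training.
train_dataset = TextDataset(tokenizer=tokenizer, file_path="book.txt", block_size=128)
data_collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

training_args = TrainingArguments(
    output_dir="gpt2-book",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    save_steps=500,
)

trainer = Trainer(
    model=model,
    args=training_args,
    data_collator=data_collator,
    train_dataset=train_dataset,
)
trainer.train()
trainer.save_model("gpt2-book")
```

After training, the saved model can be loaded with `GPT2LMHeadModel.from_pretrained("gpt2-book")` and sampled from like the base model.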
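For the Answers API route, a sketch of a call is below. The Answers endpoint was a beta feature (and has since been deprecated); the parameters follow the documentation of that era, and the question, documents, and examples are invented placeholders, assuming the book has already been split into short text snippets.

```python
# Sketch of a call to OpenAI's (beta, since-deprecated) Answers API.
# All strings below are placeholders.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

response = openai.Answer.create(
    search_model="ada",      # model used to rank the supplied documents
    model="curie",           # model used to compose the final answer
    question="What does chapter 3 say about supervised learning?",
    documents=["Chapter 3 introduces supervised learning ...",
               "Chapter 4 covers unsupervised methods ..."],
    examples_context="Chapter 1 defines machine learning as ...",
    examples=[["What is machine learning?",
               "A field that studies algorithms which learn from data."]],
    max_tokens=100,
)
print(response["answers"][0])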
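For the fine-tuning beta linked in the EDIT, the flow is roughly: prepare a JSONL file of prompt/completion pairs, upload it, then start a fine-tune job. This is a sketch using the Python client of that era; the file name `book_qa.jsonl` and the base model choice are placeholders, and the linked guide is the authoritative reference.

```python
# Sketch of the GPT-3 fine-tuning flow from the linked guide.
# "book_qa.jsonl" and the base model are placeholders.
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# Each line of book_qa.jsonl looks like:
# {"prompt": "Q: What is backpropagation?\n\nA:", "completion": " An algorithm ..."}
upload = openai.File.create(file=open("book_qa.jsonl", "rb"), purpose="fine-tune")

job = openai.FineTune.create(training_file=upload["id"], model="curie")
print(job["id"], job["status"])
```

Once the job finishes, the resulting model name can be passed as the `model` argument to the completions endpoint just like a base model.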