python, open-source, huggingface-transformers, huggingface, gpt-3

What is the best approach to creating a question generation model using GPT and BERT architectures?


I want to build a question generation model that generates questions from context. Should I use GPT-based models or BERT-based architectures?

GPT is able to perform the task but sometimes returns vague questions that are not grounded in the context itself. When I used WizardLM (7B), I got generalized questions drawn from the context that sounded more natural and were nearly to the point when kept within a limit of 3.


Solution

  • When dealing with text generation, it is more straightforward to work with Transformer decoder models such as GPT-* models. Although BERT-like models are also capable of text generation, it is quite a convoluted process and not something that follows naturally from the tasks for which these models were pretrained.

    I assume you are comparing GPT-2 and WizardLM (7B). Performance on this task is expected to improve as you scale up the number of parameters by using larger models. I would recommend trying LLMs such as Alpaca-LoRA, Dolly, or GPT-J (see here for how to run GPT-J on Colab Pro).
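
    As a starting point, here is a minimal sketch of prompting a decoder-only model for question generation with the Hugging Face transformers pipeline. The model name (gpt2 as a small placeholder), the example context, and the prompt format are assumptions; substitute a larger LLM such as GPT-J depending on your hardware.

    ```python
    # Minimal sketch: question generation with a decoder-only (GPT-style) model.
    # "gpt2" is a small placeholder model; swap in a larger LLM (e.g. GPT-J)
    # for better results. The prompt format below is an assumption, not a
    # prescribed recipe.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    context = (
        "The Transformer architecture relies entirely on attention mechanisms, "
        "dispensing with recurrence and convolutions."
    )

    # Ask the model to stay grounded in the context and cap the question count,
    # mirroring the "limit of 3" constraint mentioned in the question.
    prompt = (
        f"Context: {context}\n"
        "Write 3 questions that can be answered using only the context above.\n"
        "1."
    )

    output = generator(
        prompt,
        max_new_tokens=80,
        do_sample=True,
        temperature=0.7,
        num_return_sequences=1,
    )

    print(output[0]["generated_text"])
    ```

    With a model as small as gpt2 the questions will often be vague, which is consistent with the behavior described in the question; the same prompt with a larger instruction-tuned model should stay closer to the context.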