
Simple Transformers ConvAI model completely freezes and crashes PC: two different problems on two different machines


I am trying to load and train a ConvAI model, and I am fairly new to the whole concept. I keep running into two main problems: one on my personal computer and one on a corporate machine.

On my personal machine, it reports a memory problem. I would run the code again, but it completely crashes my computer.

On the corporate machine, the message "The freeze_support() line can be omitted if the program is not going to be frozen to produce an executable" is printed to the terminal, and then the script seems to run forever. Again, I would run the code, but I am afraid it might crash the corporate machine.

Here is my code:

from simpletransformers.conv_ai import ConvAIModel,ConvAIArgs

argsConv = {
    "num_train_epochs": 2,
    "save_model_every_epoch": False,
    "overwrite_output_dir": True
}

model = ConvAIModel(
    model_type="gpt",
    model_name="gpt_personachat_cache",
    use_cuda=False
)

model.train_model(args=argsConv)
result,modelOutputs,wrongPreds = model.eval_model(eval_file="./eval.json")

I cannot figure out what the issue is. Why does it keep crashing? I have downloaded the gpt_personachat_cache model. The only thing that this code is downloading is the train JSON file as mentioned on the Simple Transformers ConvAI Model website.

For the output, please refer to the question "simpletransformers model trained on Colab doesn't work locally".

My output is similar: the machine completely freezes and then the errors mentioned above occur.


Solution

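  • The freeze_support error message is a known issue with simpletransformers when it is run as a local Python script. The message itself comes from Python's multiprocessing module, which raises it when new processes are started while the main module is still being imported. Try putting your program (or at least the train_model() part) under an if __name__ == '__main__' guard, so that it only runs when the script is executed directly and not when the module is imported.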

    if __name__ == '__main__':
        model.train_model(args=argsConv)
        result,modelOutputs,wrongPreds = model.eval_model(eval_file="./eval.json")
    

    Running it in a Jupyter notebook, such as on Colab, does not have this problem, as this is always handled implicitly there.
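
    To show why the guard matters, here is a minimal sketch that uses only the standard library's multiprocessing module rather than simpletransformers. It assumes the hang comes from worker processes being started internally during training, which I have not verified for ConvAIModel specifically. The names square and run_pool are made up for the illustration.

```python
import multiprocessing


def square(x):
    # Work done inside a worker process. It is defined at module level
    # so that worker processes can find it after importing this file.
    return x * x


def run_pool(values):
    # Creating a Pool starts child processes. With the "spawn" start method
    # (the default on Windows), each child re-imports this file first.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only the process that was launched directly reaches this block.
    # Children that re-import the module skip it, so they do not try to
    # create further pools of their own.
    print(run_pool([1, 2, 3]))
```

    If the call to run_pool sat at module level instead, every spawned child would hit it again while importing the file, which is the situation the freeze_support message warns about. The same reasoning would apply to a train_model() call left at the top level of the script.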