nlp-question-answering simpletransformers wandb

How to log artifacts in wandb while using simpletransformers?


I am creating a Question Answering model using simpletransformers. I would also like to use wandb to track model artifacts. As I understand from the wandb docs, there is an integration touchpoint for simpletransformers, but there is no mention of logging artifacts.

I would like to log the artifacts generated during the training, validation, and test phases, such as train.json, eval.json, test.json, output/nbest_predictions_test.json, and the best-performing model.


Solution

  • Currently, simpletransformers doesn't support logging artifacts from within its training/testing scripts, but you can log them manually:

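    import os

    import wandb

    # `model` is the trained simpletransformers model; `wandb_project` is the project name used for training.
    # Resume the training run so the artifacts are attached to it.
    with wandb.init(id=model.wandb_run_id, resume="allow", project=wandb_project) as training_run:
        for name in sorted(os.listdir("outputs")):
            if "checkpoint" in name:
                # Each checkpoint directory becomes a new version of the same artifact.
                artifact = wandb.Artifact("model-checkpoints", type="checkpoints")
                artifact.add_dir(os.path.join("outputs", name))
                training_run.log_artifact(artifact)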
    

    For more info, you can follow along with the W&B notebook in the Simple Transformers README.md.
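The same manual approach should also work for the individual files mentioned in the question (train.json, eval.json, test.json, output/nbest_predictions_test.json). The following is only a minimal sketch: the helper names `existing_files` and `log_files_as_artifact` are hypothetical, not part of simpletransformers or wandb, and it assumes the files sit at those relative paths. Only `wandb.Artifact`, `Artifact.add_file`, and `Run.log_artifact` are actual wandb calls.

```python
import os


def existing_files(paths):
    """Return only the paths that exist as regular files, preserving order."""
    return [p for p in paths if os.path.isfile(p)]


def log_files_as_artifact(run, name, artifact_type, paths):
    """Log the given files to `run` as one W&B artifact.

    Hypothetical helper: `run` is an active wandb run (for example the
    `training_run` from the snippet above). Returns the list of files
    that were actually added to the artifact.
    """
    files = existing_files(paths)
    if not files:
        # Nothing to upload, so skip creating an empty artifact.
        return files

    # Deferred import: wandb is only needed once there is something to log.
    import wandb

    artifact = wandb.Artifact(name, type=artifact_type)
    for path in files:
        artifact.add_file(path)
    run.log_artifact(artifact)
    return files
```

Inside the `with wandb.init(...) as training_run:` block, a call might look like `log_files_as_artifact(training_run, "qa-data", "dataset", ["train.json", "eval.json", "test.json", "output/nbest_predictions_test.json"])`. The artifact name "qa-data" and type "dataset" are placeholders; pick whatever fits your project. Missing files are skipped silently, so check the returned list if every file is required.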