Tags: python, spacy, spacy-3

Migrate trained spaCy 2 pipelines to spaCy 3


I've been using spaCy 2.3.1 until now and have trained and saved a couple of pipelines for my custom Language class. After upgrading to spaCy 3.0, loading them with spacy.load('model-path') fails with errors such as a missing config.cfg file, among others.

Do I have to retrain the models from scratch after upgrading spaCy? Is there a step-by-step guide for migrating trained models?


Solution

  • I'm afraid you can't simply migrate the trained pipelines. Pipelines trained with v2 are not compatible with v3, so spacy.load will not work on them.

    You'll have to migrate your codebase to v3 and retrain your models. You have two options:

    • Update your training loop to change the API calls from v2 to v3. See https://spacy.io/usage/v3#migrating for details.
    • (Recommended) Move your training code entirely to the new config system in v3. While this may seem like a big change, you'll get the hang of the config system quite quickly, and you'll find it more powerful and convenient than writing everything yourself from scratch. To get started, have a look at the init config command, e.g.:
    python -m spacy init config config.cfg --lang en --pipeline ner,textcat --optimize accuracy
    

    This will give you some sensible defaults to start from, and a config file that you can customize further according to your requirements.
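
    If you go with the first option and keep a hand-written training loop, the main v2-to-v3 changes look roughly like the sketch below. It is only an illustration under assumptions: the toy TRAIN_DATA, the train_ner helper and the use of spacy.blank("en") are made up for the example (with a custom Language class you would construct that class instead), and a real loop would add minibatching, dropout and evaluation.

```python
import random

import spacy
from spacy.training import Example

# Hypothetical toy data in the v2-style (text, annotations) format.
TRAIN_DATA = [
    ("Alice moved to Berlin", {"entities": [(0, 5, "PERSON"), (15, 21, "GPE")]}),
    ("Bob works in Paris", {"entities": [(0, 3, "PERSON"), (13, 18, "GPE")]}),
]


def train_ner(train_data, n_iter=5, lang="en"):
    """Train a blank NER pipeline using the v3 training API (illustrative helper)."""
    # Assumption: a stock blank pipeline; swap in your own Language subclass here.
    nlp = spacy.blank(lang)

    # v3: add_pipe takes the registered component name, not a component object
    # created with nlp.create_pipe as in v2.
    ner = nlp.add_pipe("ner")
    for _, annotations in train_data:
        for _start, _end, label in annotations["entities"]:
            ner.add_label(label)

    # v3: training data is wrapped in Example objects instead of being passed
    # to nlp.update as separate lists of texts and annotation dicts.
    examples = [
        Example.from_dict(nlp.make_doc(text), annotations)
        for text, annotations in train_data
    ]

    # v3: nlp.initialize replaces nlp.begin_training and returns the optimizer.
    optimizer = nlp.initialize(lambda: examples)

    for _ in range(n_iter):
        random.shuffle(examples)
        losses = {}
        nlp.update(examples, sgd=optimizer, losses=losses)
    return nlp
```

    A pipeline retrained this way and saved with nlp.to_disk("model-path") is written in the v3 format, including the config.cfg that spacy.load looks for.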