python nlp tokenize spacy

Use the spaCy Spanish tokenizer


I have always used the spaCy library with English or German.

To load the library I used this code:

import spacy
nlp = spacy.load('en')

I would like to use the Spanish tokeniser, but I do not know how, because spaCy does not have a Spanish model. I've tried this:

python -m spacy download es

and then:

nlp = spacy.load('es')

But obviously without any success.

Does anyone know how to tokenise a Spanish sentence with spaCy in the proper way?


Solution

  • For versions up to 1.6, this code works properly:

    from spacy.es import Spanish
    nlp = Spanish()
    

    but in version 1.7.2 a small change is necessary:

    from spacy.es import Spanish
    nlp = Spanish(path=None)
    

    Source: @honnibal in Gitter chat
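
To cover the tokenising step itself, here is a minimal sketch that builds a tokenizer-only Spanish pipeline and splits a sentence into tokens. It assumes spaCy is installed. The helper names `make_spanish_nlp` and `tokenize_spanish` are made up for illustration. The `spacy.blank("es")` branch is an addition for spaCy 2.0 and later, where the language classes moved to `spacy.lang`; the fallback is the 1.7.2 form quoted above and is untested against other 1.x releases.

```python
def make_spanish_nlp():
    """Return a Spanish pipeline that only tokenizes (no trained model needed)."""
    import spacy  # assumption: spaCy is installed in the environment

    if hasattr(spacy, "blank"):
        # spaCy 2.0 and later: a blank pipeline for the language code "es"
        # carries the Spanish tokenizer rules but no tagger or parser.
        return spacy.blank("es")

    # spaCy 1.x fallback, following the 1.7.2 form from the answer above.
    from spacy.es import Spanish
    return Spanish(path=None)


def tokenize_spanish(text):
    """Split a Spanish string into a list of token strings."""
    nlp = make_spanish_nlp()
    doc = nlp(text)
    return [token.text for token in doc]
```

Calling `tokenize_spanish("Hola mundo, esto es una prueba.")` returns the token strings as a list. No trained model is downloaded, so only tokenisation is available, not tagging or parsing.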