python, nlp, spacy

len(nlp.vocab) shows only 486; how can I load the full vocab of 57852?


I have installed spaCy using conda.

conda install -c conda-forge spacy

python -m spacy download en

The installed version was: [screenshots not available]

import spacy

nlp = spacy.load('en_core_web_sm')

doc = nlp(u"Let's visit St. Louis in the U.S. next year.")

len(doc)

len(doc.vocab)

len(nlp.vocab)

Both len(doc.vocab) and len(nlp.vocab) return only 486.

How can I load the full vocabulary so that it shows 57852?


Please help me with this.

Thanks, Venkat


Solution

  • You have downloaded the small spaCy model, which is why the vocabulary is so small. You can download either the medium model (91 MB) or the large model (789 MB) with these commands:

    # medium
    python -m spacy download en_core_web_md
    
    # large
    python -m spacy download en_core_web_lg
    

    To use either of them, load it the same way you loaded the small model:

    # medium
    nlp = spacy.load('en_core_web_md')
    
    # large
    nlp = spacy.load('en_core_web_lg')
    

    The spaCy documentation lists all English models and how to install them.
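
To compare the models side by side, a minimal sketch along the following lines may help. The helper name `vocab_size` is hypothetical (it is not part of spaCy), and the code assumes only that `spacy.load` raises `OSError` when a model package is not installed. The numbers it prints depend on the spaCy version and model release, so they may not match the 486 or 57852 from the question.

```python
import importlib.util


def vocab_size(model_name):
    """Return len(nlp.vocab) for an installed spaCy model, or None if unavailable.

    Hypothetical helper for illustration; not part of the spaCy API.
    """
    # spaCy itself may not be installed in this environment.
    if importlib.util.find_spec("spacy") is None:
        return None
    import spacy

    try:
        nlp = spacy.load(model_name)
    except OSError:
        # spacy.load raises OSError when the model package cannot be found.
        return None
    return len(nlp.vocab)


# Report the vocab size for each English model that happens to be installed.
for name in ("en_core_web_sm", "en_core_web_md", "en_core_web_lg"):
    size = vocab_size(name)
    print(name, "not installed" if size is None else size)
```

Models that have not been downloaded are reported as "not installed" rather than raising an error, so it is easy to see which of the download commands above still needs to be run.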