
spaCy returns "AttributeError: 'spacy.tokens.doc.Doc' object has no attribute 'spans'" on a simple .spans assignment. Why?


I'm just trying to mark subparts of a document as spans, as per spaCy's documentation:

import spacy
nlp = spacy.load('en_core_web_sm')
sentence = "The car with the white wheels was being confiscated by the police when the owner returns from robbing a bank"
doc = nlp(sentence)

doc.spans['remove_parts'] = [doc[2:6], doc[9:12]]
doc.spans['remove_parts']

This looks pretty straightforward, but spaCy returns the following error (and attributes it to the assignment line):

AttributeError: 'spacy.tokens.doc.Doc' object has no attribute 'spans'

I can't see what's going on at all. Is this a spaCy bug? Has the spans property been removed even though it is still in the documentation? If not, what am I missing?

PS: I'm using Colab for this, and spacy.info shows:

spaCy version    2.2.4                         
Location         /usr/local/lib/python3.7/dist-packages/spacy
Platform         Linux-4.19.112+-x86_64-with-Ubuntu-18.04-bionic
Python version   3.7.10                        
Models           en    

Solution

  • This code:

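    from spacy.lang.en import English

    # A blank English pipeline is enough here; doc.spans needs no trained model
    nlp = English()
    text = "The car with the white wheels was being confiscated by the police when the owner returns from robbing a bank"
    doc = nlp(text)
    # Store two token slices as a named span group on the Doc
    doc.spans['remove_parts'] = [doc[2:6], doc[9:12]]
    doc.spans['remove_parts']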
    

    should work correctly from spaCy v3.0 onwards, which is the release that introduced doc.spans. The spacy.info output in the question reports v2.2.4, so the Doc object there has no spans attribute. If you installed v3 and still see the error, can you verify that you are in fact running the code from the correct virtual environment within Colab (and not a different environment using spaCy v2)? We have previously seen issues where Colab would still be accessing older installations of spaCy on the system instead of sourcing the code from the correct venv. To double-check, you can try running the code in a Python console directly instead of through Colab.
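
The environment check suggested above can be sketched as follows. This is a minimal illustration, not part of spaCy: `supports_doc_spans` and `report_spacy_environment` are hypothetical helper names, and the version test assumes only that `doc.spans` is available from major version 3 onwards. It uses `spacy.__version__` and `spacy.__file__` to show which installation the current interpreter actually imports.

```python
import sys


def supports_doc_spans(version: str) -> bool:
    """Return True if a spaCy version string is 3.0 or newer.

    Hypothetical helper for illustration; it is not part of spaCy.
    """
    major = int(version.split(".")[0])
    return major >= 3


def report_spacy_environment() -> str:
    """Describe which spaCy installation the current interpreter imports."""
    try:
        import spacy
    except ImportError:
        return f"spaCy is not importable from {sys.executable}"
    status = "has" if supports_doc_spans(spacy.__version__) else "lacks"
    return f"spaCy {spacy.__version__} at {spacy.__file__} {status} doc.spans"


print(report_spacy_environment())
```

If the reported version is below 3.0 or the path points at an unexpected installation, upgrading spaCy in that same environment and restarting the runtime is the likely fix. Note that after moving to v3 the `en_core_web_sm` model from the question would probably need to be downloaded again for the new version.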