
Difference between two lists of the same 'list' type (printed with and without apostrophes ' ') in Python


I have two lists. list_1 is defined directly by me; list_2 is built from the output of a spaCy operation. Both report the type 'list', but they are obviously different: one prints its items with ' ' and the other without.

Question_1:

Are they both exactly the same type of list in Python?

import sys
import re
import spacy
from spacy.tokens import Token
nlp = spacy.load("en_core_web_sm")
nlp = spacy.load("en_core_web_md")

list_1 = ['apple', 'orange', 'banana', 'this is a dog']
print(list_1, type(list_1))

sentence = 'apple and orange and banana this is a dog'
doc = nlp(sentence)
list_2 = []
for i in doc.noun_chunks:
    list_2.append(i)
print(list_2,type(list_2))

Output:

list_1: ['apple', 'orange', 'banana', 'this is a dog'] <class 'list'>
list_2: [apple, orange, banana, a dog] <class 'list'>

Question_2:

How can I solve the following error?

I assumed they were exactly the same type, but when I use list_2 like a normal list in the following code, it raises an error.

for i in list_2:
    if "dog" in i:
        print(list_2.index(i))

Error:

TypeError                                 Traceback (most recent call last)
<ipython-input-110-6f9c38535050> in <module>
     16 print(list_2,type(list_2))
     17 for i in list_2:
---> 18     if "dog" in i:
     19         print(list_2.index(i))
     20 

TypeError: Argument 'other' has incorrect type (expected spacy.tokens.token.Token, got str)

Thanks!


Solution

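  • It looks like spaCy is giving you an object, not just the text. Both are ordinary Python lists; the difference is in the elements: list_1 holds str values, while each item yielded by doc.noun_chunks is a spaCy Span object. When you build list_2, try: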

    for i in doc.noun_chunks:
        list_2.append(i.text)
    

    and that should give you the str-to-str comparison you're looking for.
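
    The printing difference can be reproduced without spaCy. The sketch below is only an illustration: FakeSpan is a hypothetical stand-in class (not part of spaCy) whose repr is its bare text, which is an assumption about how the real Span prints, based on the output shown in the question. It shows that both containers are plain lists, that print() on a list displays the repr of each element, and that converting to text first makes the "dog" search work. It also uses enumerate instead of list_2.index(i) to get the position.

```python
class FakeSpan:
    """Hypothetical stand-in for a spaCy Span (not the real class)."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        # Assumption: like the output in the question, the repr is the
        # bare text with no quotes around it.
        return self.text


list_1 = ['apple', 'orange', 'banana', 'this is a dog']
list_2 = [FakeSpan(t) for t in ['apple', 'orange', 'banana', 'a dog']]

# Both containers are ordinary Python lists; only the element types differ.
print(type(list_1) is type(list_2))
print(type(list_1[0]).__name__, type(list_2[0]).__name__)

# print() on a list shows repr() of each element: the repr of a str
# includes quotes, while the repr of FakeSpan does not.
print(list_1)
print(list_2)

# Convert each element to its text first, then search with plain str logic.
texts = [span.text for span in list_2]
dog_positions = [i for i, text in enumerate(texts) if "dog" in text]
print(dog_positions)
```

    With real spaCy objects the same pattern would apply to the items of doc.noun_chunks, using their .text attribute as in the answer above.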