How to solve "TypeError: list indices must be integers or slices, not str" with a list of dictionaries?

I have this list of words and their corresponding POS and other values:

sentence= [[{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}], [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}], [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}], [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}]]

I'm trying to extract all words whose entity is not "verb" or "prep". In other words, I want to exclude verbs and prepositions. I used this code:

sentence = [ sub['word'] for sub in sentence if sub['entity']!='verb' ]

But I get this error:

TypeError: list indices must be integers or slices, not str

Thank you

Solution

Your input data is a list of lists. Each sub-list contains a single element, which is a dictionary. In your comprehension, sub is therefore a list, so sub['entity'] tries to index a list with a string, and that is what raises the TypeError. The fact that the individual dictionaries are in a list implies that there might be more than one dictionary in each sub-list (otherwise why would you use a list?). Your code should account for that.
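To see the failure concretely, here is a minimal sketch. It uses a shortened copy of the data (only the 'entity' and 'word' keys of the first two entries, which is my simplification), and the variable names sub_type, error_message and first_dict are just illustrative:

```python
# Shortened copy of the data from the question (two entries, two keys each).
sentence = [
    [{'entity': 'adj', 'word': 'we'}],
    [{'entity': 'verb', 'word': 'have'}],
]

sub = sentence[0]               # what the comprehension binds to `sub`
sub_type = type(sub).__name__   # the inner element is a list, not a dict

try:
    sub['entity']               # indexing a list with a string key
except TypeError as exc:
    error_message = str(exc)    # same TypeError as in the question

first_dict = sub[0]             # index the inner list first...
entity = first_dict['entity']   # ...then the dictionary lookup works

print(sub_type)
print(error_message)
print(entity)
```

So the dictionary is one level deeper than the original comprehension assumes.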

A robust way to deal with this is to write a generator that iterates over both list levels and yields only the words whose entity is not in the ignore list.

For example:

sentence= [[{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}], [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}], [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}], [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}]]
# Yield every word whose entity is not listed in `ignore`.
def extract(_list, ignore):
    for element in _list:        # each element is a sub-list
        for _dict in element:    # a sub-list may hold one or more dicts
            if _dict.get('entity') not in ignore:
                yield _dict.get('word')

for word in extract(sentence, ['verb', 'prep']):
    print(word)

Output:

we
see
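
If you would rather stay close to your original one-liner, a nested list comprehension should give the same result. This is only a sketch under the same assumption (each sub-list holds zero or more dictionaries). The names ignore and words are mine, and I have trimmed the dictionaries to the two keys that matter here:

```python
# Same entries as in the question, trimmed to the 'entity' and 'word' keys.
sentence = [
    [{'entity': 'adj', 'word': 'we'}],
    [{'entity': 'verb', 'word': 'have'}],
    [{'entity': 'verb', 'word': 'become'}],
    [{'entity': 'noun', 'word': 'see'}],
]

ignore = {'verb', 'prep'}  # a set gives fast membership tests

# The outer loop walks the sub-lists, the inner loop walks the dicts in each.
words = [
    d.get('word')
    for sub in sentence
    for d in sub
    if d.get('entity') not in ignore
]

print(words)  # ['we', 'see']
```

The generator version is handy when you want to process words lazily. The comprehension builds the whole list at once, which is fine for data of this size.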