How to solve "TypeError: list indices must be integers or slices, not str" with a list of dictionaries?


I have this list of words and their corresponding POS and other values:

sentence= [[{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}], [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}], [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}], [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}]]

I'm trying to extract all words whose entity is not "verb" or "prep". In other words, I want to exclude verbs and prepositions. I used this code:

sentence = [ sub['word'] for sub in sentence if sub['entity']!='verb' ]

But I get this error:

TypeError: list indices must be integers or slices, not str

Thank you


Solution

  • Your input is a list of lists, and each sub-list contains a single dictionary. In your comprehension, sub is therefore a list, not a dictionary, so sub['entity'] indexes a list with a string and raises the TypeError. Because the dictionaries are wrapped in lists, a sub-list might hold more than one dictionary (otherwise why use a list?), and your code should account for that.

    A robust way to deal with this is to write a generator function that iterates over both list levels and yields the relevant words.

    For example:

    sentence= [[{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}], [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}], [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}], [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}]]
    # Yield the word of every dictionary whose entity is not in `ignore`.
    def extract(_list, ignore):
        for element in _list:      # each element is a sub-list
            for _dict in element:  # each sub-list holds one or more dictionaries
                if _dict.get('entity') not in ignore:
                    yield _dict.get('word')
    
    for word in extract(sentence, ['verb', 'prep']):
        print(word)
    

    Output:

    we
    see
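
If you would rather keep a one-line comprehension like the one in the question, a nested comprehension is one possible alternative to the generator. The sketch below reuses the data from the question; the names first, ignore and words are introduced only for illustration. It first confirms that each outer element is a list (which is why sub['entity'] failed), then loops over both levels.

```python
sentence = [
    [{'entity': 'adj', 'score': 0.9004535, 'index': 1, 'word': 'we', 'start': 0, 'end': 7}],
    [{'entity': 'verb', 'score': 0.8782018, 'index': 1, 'word': 'have', 'start': 0, 'end': 6}],
    [{'entity': 'verb', 'score': 0.9984743, 'index': 1, 'word': 'become', 'start': 0, 'end': 3}],
    [{'entity': 'noun', 'score': 0.9953852, 'index': 1, 'word': 'see', 'start': 0, 'end': 6}],
]

# Each element of the outer list is itself a list, so indexing it
# with a string key (as in the question) raises the TypeError.
first = sentence[0]
print(type(first).__name__)  # list

# Entities to skip (illustrative name; the tags come from the question).
ignore = {'verb', 'prep'}

# Outer loop walks the sub-lists, inner loop walks the dictionaries in each.
words = [d['word'] for sub in sentence for d in sub if d['entity'] not in ignore]
print(words)  # ['we', 'see']
```

Unlike the generator, this builds the whole result list in memory at once, which is fine for a sentence-sized input.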