python list for-loop range del

IndexError: list assignment index out of range when deleting from a list


I am trying to write a program that loops through a list of headlines and removes any item that has a similar headline elsewhere in the list.

# Loop through headlines and remove over 50% similar ones
headlines = listHeadlines()
# headlines.append('Our plan is working says Hunt, as Bank raises interest rate to 5.25%')
print(len(headlines), headlines)
headlines_copy = list(headlines)
for headline in headlines_copy:
    for h in headlines_copy:
        if h == headline:
            pass
        elif areStringsSimilar(h, headline):
            del headlines[headlines_copy.index(headline)]
            break  # Exit this loop and move back to other because headline has been deleted from list.

print(len(headlines), headlines)

The first print(len(headlines), headlines) works and prints 1248 [list], but then I get this error:

Traceback (most recent call last):
  File "/Users/[path]/main.py", line 95, in <module>
    del headlines[headlines_copy.index(headline)]
        ~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: list assignment index out of range

Process finished with exit code 1

Solution
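
The error comes from looking up a position in headlines_copy and then deleting at that position in headlines. After the first deletion, headlines is shorter than the copy, so the two lists no longer line up: some deletions remove the wrong item, and eventually an index points past the end of headlines. A minimal sketch of that drift, using a made-up four-item list in place of the real headlines:

```python
# Stand-in data (an assumption for illustration); the real list comes from listHeadlines().
headlines = ["a", "b", "c", "d"]
headlines_copy = list(headlines)

# Delete "a" and "b" using positions looked up in the copy.
del headlines[headlines_copy.index("a")]  # index 0, headlines is now ["b", "c", "d"]
del headlines[headlines_copy.index("b")]  # index 1, which removes "c" instead of "b"
print(headlines)

# "d" sits at index 3 in the copy, but headlines now holds only two items.
error_message = None
try:
    del headlines[headlines_copy.index("d")]
except IndexError as exc:
    error_message = str(exc)
    print("IndexError:", error_message)
```

So the loop can silently delete the wrong headlines before it ever raises the IndexError.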

  • Instead of deleting the headlines you don't want, build a new list of the headlines you want to keep:

    headlines = listHeadlines()
    deduplicated = []
    for candidate in headlines:
        # Keep a headline only if it is not similar to any headline already kept.
        if not any(areStringsSimilar(candidate, kept) for kept in deduplicated):
            deduplicated.append(candidate)
    print(deduplicated)
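
To try the keep-list approach without the rest of the program, here is a self-contained sketch. The question does not show areStringsSimilar, so the version below is a hypothetical stand-in built on difflib.SequenceMatcher with a 0.5 threshold (taken from the "over 50% similar" comment in the question). The deduplicate helper and the sample headlines are likewise made up for illustration.

```python
from difflib import SequenceMatcher


def areStringsSimilar(a, b, threshold=0.5):
    """Hypothetical stand-in: True when the similarity ratio exceeds threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() > threshold


def deduplicate(headlines):
    """Return a new list, keeping the first headline of each group of similar ones."""
    kept = []
    for candidate in headlines:
        # Compare only against headlines already kept; nothing is ever deleted.
        if not any(areStringsSimilar(candidate, k) for k in kept):
            kept.append(candidate)
    return kept


# Made-up sample data standing in for listHeadlines().
sample = [
    "Bank raises interest rate to 5.25%",
    "Bank raises interest rate to 5.25%",
    "Bank raises interest rates to 5.25%",
    "Local dog wins cup",
]
result = deduplicate(sample)
print(len(result), result)
```

Because the original list is never modified, there are no indices to go stale. The first headline of each similar group is the one that survives, so the order of the input decides which variant is kept.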