Tags: python, list, text, text-mining

Function to find a word in a list and then print the following 50 lines


I have a gigantic txt file that I read in and cleaned into a list of lines.
I'm looking for certain words, so I wrote a quick function:

def find_words(lines):

    for line in lines:
        if "my words" in line:
            print(line)

This works fine, but how would I write the function so that it prints the matching line plus the next 50 lines or so? In short, I want to find the text that comes after that word.

After that, I would like to create an empty DataFrame (df) and have the function add a new row to it, containing the word plus the next 50 lines, every time it finds that word.


Solution

  • Quick & dirty solution:

    for i, line in enumerate(lines):
        if "my words" in line:
            # the matching line plus the 50 lines after it
            print(*lines[i:i+51], sep="\n")

    • enumerate sets i to the index of the current line in the lines list
    • when a matching line is found, a slice of lines is printed, from the current index up to (but not including) index i+51, i.e. the match plus the next 50 lines; near the end of the list the slice is simply shorter
    • print(*..., sep="\n") prints each line of the slice separated by a line break
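The question also asks for one row per match rather than printed output. A minimal sketch of that, assuming the same slicing idea: the names collect_matches, needle and n_after are made up for illustration, and each record is a plain dict so the example runs without pandas.

```python
def collect_matches(lines, needle, n_after=50):
    """Return one record per line containing `needle`, plus the lines after it."""
    rows = []
    for i, line in enumerate(lines):
        if needle in line:
            rows.append({
                "match": line,
                # Slicing past the end of the list is safe: it just returns fewer lines.
                "following": "\n".join(lines[i + 1:i + 1 + n_after]),
            })
    return rows


# Small made-up sample, with a window of 2 lines instead of 50
sample = ["intro", "my words here", "a", "b", "c", "my words again", "d"]
rows = collect_matches(sample, "my words", n_after=2)
```

If pandas is installed, pandas.DataFrame(rows) should turn these records into a DataFrame with one row per match and the columns match and following. Building the list first and creating the DataFrame once at the end is usually cheaper than appending to a DataFrame row by row.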

    If the file has a huge number of lines, you might want to avoid loading all of them into memory at once (see https://stackoverflow.com/a/48124263/11245195, although the workaround for your problem might be different).
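One possible way to avoid holding the whole file in memory is to iterate over the file object and pull the following lines with itertools.islice. This is only a sketch, not necessarily what the linked answer does: iter_matches, needle and n_after are hypothetical names, and io.StringIO stands in for a real file. Note one difference from the slicing version: the lines consumed after a match are not searched again, so a second match inside that window is not reported separately.

```python
import io
from itertools import islice


def iter_matches(line_iter, needle, n_after=50):
    """Yield (matching line, list of the next n_after lines) from an iterable of lines."""
    it = iter(line_iter)
    for line in it:
        if needle in line:
            # islice advances the same iterator, so these lines are consumed here
            # and the outer loop resumes after them.
            following = [l.rstrip("\n") for l in islice(it, n_after)]
            yield line.rstrip("\n"), following


# io.StringIO behaves like an open text file; the content is made up
fake_file = io.StringIO("intro\nmy words here\na\nb\nc\nmy words again\nd\n")
found = list(iter_matches(fake_file, "my words", n_after=2))
```

With a real file, the same generator can be fed the object returned by open(...), which yields one line at a time instead of building the full list.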