Flexible sliding window (in Python)


Problem description: I want to look at each term in a text together with a window of context words, say 3 words to the left and 3 to the right. The base case has the form w-3 w-2 w-1 term w+1 w+2 w+3. I want to implement a sliding window over my text that records the context words of each term. Every word is treated as the term once; when the window moves on, it becomes a context word. However, when the term is the 1st word in a line, there are no context words on the left (t w+1 w+2 w+3); when it is the 2nd word, there is only one context word on the left; and so on. I'm looking for hints on implementing this flexible sliding window (in Python) without writing out each possible case separately.

To recap:

Example of input:

["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]

Output:

t1 w2 w3 w4

w1 t2 w3 w4 w5

w1 w2 t3 w4 w5 w6

w1 w2 w3 t4 w5 w6 w7

__ w2 w3 w4 t5 w6 w7 w8

__ __ etc.

My current plan is to implement this with a separate condition for each line in the output.


Solution

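  • If you want a sliding window of n words, use a double-ended queue (collections.deque) with maxlen=n as a buffer. Once the deque is full, each append on the right discards the oldest item on the left, so the window slides by itself.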

    This should illustrate the concept:

    from collections import deque

    mystr = "StackOverflow"
    window = deque(maxlen=5)  # holds at most 5 items; the oldest is dropped when full
    for char in mystr:
        window.append(char)
        print(''.join(window))  # join accepts the deque directly
    

    Output:

    S
    St
    Sta
    Stac
    Stack
    tackO
    ackOv
    ckOve
    kOver
    Overf
    verfl
    erflo
    rflow
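
For the word-level case in the question, where the term sits in the middle of the window, a minimal sketch follows. It assumes the words are already in a list and uses plain slicing rather than the deque shown above; the name `context_windows` and its `size` parameter are illustrative and not part of any library.

```python
def context_windows(words, size=3):
    """Yield (left_context, term, right_context) for every word in `words`.

    Hypothetical helper: `size` is the maximum number of context words
    taken on each side of the term.
    """
    for i, term in enumerate(words):
        # max(0, ...) stops the left slice from wrapping around to the end
        left = words[max(0, i - size):i]
        # slicing past the end of a list simply returns fewer items
        right = words[i + 1:i + 1 + size]
        yield left, term, right


words = ["w1", "w2", "w3", "w4", "w5", "w6", "w7", "w8", "w9", "w10"]
for left, term, right in context_windows(words):
    # mark the current term with brackets so it stands out from its context
    print(" ".join(left + ["[" + term + "]"] + right))
```

Because slicing clamps at both ends of the list, the first and last few words need no separate condition. For text streamed from an iterator, a deque with maxlen=2*size+1 would probably be a closer fit, at the cost of some extra handling for the first and last words.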