
NLTK draw tree in non-blocking way


NLTK provides a feature that allows you to "draw" tree structures, e.g. a dependency parse. In practice, when you call tree.draw(), a window pops up (on Windows at least) showing the drawn tree. This is nice functionality, but the call blocks: the script stops at tree.draw() and does not continue until you close the window of the newly drawn tree.

Is there any way to draw trees in a non-blocking way, i.e. without them stopping a script's execution? I have thought about starting a separate process in Python that is responsible for drawing the trees, but perhaps there is a more straightforward way.
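
The separate-process idea could be sketched roughly as follows. This is only an illustration under assumptions: `make_drawer` and `_draw_tree` are hypothetical helper names, and the tree is passed as a bracketed string so that the child process can rebuild it with Tree.fromstring.

```python
import multiprocessing


def _draw_tree(bracketed):
    # Runs in the child process. NLTK is imported here, so only the child
    # loads Tkinter; draw() blocks the child, not the calling script.
    from nltk import Tree
    Tree.fromstring(bracketed).draw()


def make_drawer(bracketed):
    """Return an unstarted process that draws the bracketed tree string."""
    # The process is not a daemon, so Python waits for the window to be
    # closed before the parent script finally exits.
    return multiprocessing.Process(target=_draw_tree, args=(bracketed,))
```

On Windows the call would have to sit under an `if __name__ == "__main__":` guard, for example `make_drawer(str(tree)).start()`. The script then carries on while the tree window stays open in the child process.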


Solution

  • NLTK uses a Tkinter canvas to display tree structures. Tkinter's mainloop method waits for events and updates the GUI, but it does not return until the window is closed, so it blocks all the code after it.
    Instead of mainloop we can call the update method, which is non-blocking: it processes the pending events, redraws the Tkinter canvas, and then returns.
    Here is how to do this with NLTK:

    import nltk
    from nltk import pos_tag
    # Chunk grammar, one rule per line (no "..." prompt markers inside the string)
    pattern = """NP: {<DT>?<JJ>*<NN>}
    VBD: {<VBD>}
    IN: {<IN>}"""
    NPChunker = nltk.RegexpParser(pattern)
    
    sentences = ['the small dog is running',
                 'the big cat is sleeping',
                 'the green turtle is swimming'
                ]
    
    def makeTree(sentence):
        # POS-tag the whitespace-separated tokens, then chunk them into a tree
        tree = NPChunker.parse(pos_tag(sentence.split()))
        return tree
    
    from nltk.draw import TreeWidget
    from nltk.draw.util import CanvasFrame
    
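    cf = CanvasFrame()  # creates the top-level window that holds every tree
    
    for sentence in sentences:
        tree = makeTree(sentence)
        tc = TreeWidget(cf.canvas(), tree)
        cf.add_widget(tc)     # no x/y given, so the frame finds room for the widget
        cf.canvas().update()  # redraw now and return immediately (non-blocking)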
    
    
    # ... the rest of your code goes here
    print('this code is non-blocking')
    
    # At the end, call mainloop() so that the program doesn't terminate and close the window
    cf.mainloop()
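
Because update() only handles the events that are pending at the moment it is called, the window is not redrawn while the rest of the script is busy. A minimal sketch of one way to deal with that, assuming the remaining work can be split into small steps; `run_with_refresh` and its parameters are hypothetical names, not part of NLTK or Tkinter:

```python
def run_with_refresh(steps, refresh):
    """Call each zero-argument function in `steps`, refreshing the GUI after each.

    `refresh` is any zero-argument callable that redraws the window
    (an assumption: for the code above it would be cf.canvas().update).
    """
    results = []
    for step in steps:
        results.append(step())  # one chunk of the script's real work
        refresh()               # let Tkinter process pending events and redraw
    return results
```

With the canvas frame from the example above, this might be called as run_with_refresh(steps, cf.canvas().update), where steps is a list of functions you define yourself. The shorter each step is, the more responsive the tree window stays.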