Search code examples
nlpsyntaxnetdependency-parsing

Converting Dependency tree into sequence of Arc-eager transitions


Currently I'm trying to build syntax-aware NMT model.
In this project, I need the sequence of one of three transition actions (SHIFT, REDUCE-L, REDUCE-R)

Similar to what is in the image a

This chunk represents the transition-based dependency for 2 sentences(1 for 1 chunk split by empty lines)

I'm using Syntaxnet to get the dependency parse tree first, but it doesn't directly provide that transition action sequences.
It's results are as follows,

b

Is it possible to get the action sequences similar to this image? Is it possible to convert what is achieved from this image to the original image's format.


Solution

  • A function that converts a dependency tree to a sequence of transitions is called an oracle. It is a necessary component of a statistical parser. The transitions you described (shift, reduce-l, reduce-r)¹ are those of the arc-standard transition system (not the arc-eager system, which is: shift, left-arc, right-arc, reduce).

    Pseudo-code for an arc-standard oracle:

    stack = [] # stack[0] is the top of the stack
    buffer = [w1, w2, ..., wn]
    
    while len(buffer) > 0 or len(stack) != 1:
        if stack[0] is the head of stack[1]:
            apply left-arc
        if stack[1] is the head of stack[0]:
            if there is a token in the buffer whose head is stack[0] 
                # (i.e not all children of stack[1] have been attached to it)
                apply shift
            else:
                apply right-arc
    

    These slides present the two parsing algorithms and their oracles.

    ¹Reduce-left, reduce-right are often named right-arc and left-arc in the context of dependency parsing.