Currently I'm trying to build syntax-aware NMT model.
In this project, I need the sequence of one of three transition actions (SHIFT, REDUCE-L, REDUCE-R)
Similar to what is in the image a
This chunk represents the transition-based dependency for 2 sentences(1 for 1 chunk split by empty lines)
I'm using Syntaxnet
to get the dependency parse tree first, but it doesn't directly provide that transition action sequences.
It's results are as follows,
Is it possible to get the action sequences similar to this image? Is it possible to convert what is achieved from this image to the original image's format.
A function that converts a dependency tree to a sequence of transitions is called an oracle. It is a necessary component of a statistical parser. The transitions you described (shift, reduce-l, reduce-r)¹ are those of the arc-standard transition system (not the arc-eager system, which is: shift, left-arc, right-arc, reduce).
Pseudo-code for an arc-standard oracle:
stack = [] # stack[0] is the top of the stack
buffer = [w1, w2, ..., wn]
while len(buffer) > 0 or len(stack) != 1:
if stack[0] is the head of stack[1]:
apply left-arc
if stack[1] is the head of stack[0]:
if there is a token in the buffer whose head is stack[0]
# (i.e not all children of stack[1] have been attached to it)
apply shift
else:
apply right-arc
These slides present the two parsing algorithms and their oracles.
¹Reduce-left, reduce-right are often named right-arc and left-arc in the context of dependency parsing.