I have the following list:
[
['the', 'the +Det'],
['dog', 'dog +N +A-right'],
['ran', 'run +V +past'],
['at', 'at +P'],
['me', 'I +N +G-left'],
['and', 'and +Cnj'],
['the', 'the +Det'],
['ball', 'ball +N +G-right'],
['was', 'was +C'],
['kicked', 'kick +V +past']
['by', 'by +P']
['me', 'I +N +A-left']
]
Basically, what I'm looking to do is:
1. Search for the tags +G-left, +A-left, +G-right, and +A-right.
2. If +G-left or +A-left is seen, look backward to the first instance of a list with the element +V, add the first element of the list containing +G-left or +A-left to the end of the list containing +V, with the +G-left or +A-left tag attached, then move on and repeat.
3. If +G-right or +A-right is seen, look forward to the first instance of a list with the element +V, add the first element of the list containing +G-right or +A-right to the end of the list containing +V, with the +G-right or +A-right tag attached, then move on and repeat.
So in the case of my above example, the desired states would be:
[
['the', 'the +Det'],
['dog', 'dog +N +A-right'],
['ran', 'run +V +past', 'dog+A-right', 'me+G-left'],
['at', 'at +P'],
['me', 'I +N +G-left'],
['and', 'and +Cnj'],
['the', 'the +Det'],
['ball', 'ball +N +G-right'],
['was', 'was +C'],
['kicked', 'kick +V +past', 'ball+G-right', 'me+A-left']
['by', 'by +P']
['me', 'I +N +A-left']
]
I think the proper way to approach this is with re, so:
import re
gleft = re.compile(r"G-left")
gright = re.compile(r"G-right")
aleft = re.compile(r"A-left")
aright = re.compile(r"A-right")
then something like
for item in list:
    if aleft.search(item[1]):  # search(), since match() only tests the start of the string
        # somehow work backwards to find the +V tag
        whatever.insert(-1, item[0])  # can you concatenate a string here to add +A-left?
    if aright.search(item[1]):
        # somehow work forwards to find the +V tag
        whatever.insert(-1, item[0])  # can you concatenate a string here to add +A-right?
And the same thing but with the G tags.
Hopefully someone can help point me in the right direction. I believe I've broken down the steps correctly; I'm just not yet familiar enough with Python to know the syntax for this off the top of my head.
This can probably be simplified somewhat by using an auxiliary function, but that aside, try this, which doesn't require regex:
wls = [...]  # your list of lists from above, fixed (some commas are missing)
for i, wl in enumerate(wls):
    tags = wl[1]  # only look at the original tag string, not at entries appended later
    last_tag = tags.split(' ')[-1]
    if '-right' in tags:
        # look forward for the first list whose tag string contains +V
        for wt in wls[i + 1:]:
            if '+V' in wt[1]:
                wt.append(wl[0] + last_tag)
                break
    if '-left' in tags:
        # look backward for the first list whose tag string contains +V
        for wt in reversed(wls[:i]):
            if '+V' in wt[1]:
                wt.append(wl[0] + last_tag)
                break
wls
The output should be what you are looking for.
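For what it's worth, here is a rough sketch of the auxiliary-function idea mentioned above. It assumes the tag string always sits at index 1 of each inner list and that the -left/-right tag is the last token of that string, as in the example data. The names nearest_verb and attach_to_verbs are made up for illustration, and the sketch returns a new list rather than modifying the input.

```python
import copy


def nearest_verb(rows, start, step):
    """Return the first row with a +V tag, scanning from `start` in direction `step`.

    `step` is +1 to look forward and -1 to look backward; returns None if no
    row with a +V tag is found in that direction.
    """
    i = start + step
    while 0 <= i < len(rows):
        if '+V' in rows[i][1].split():
            return rows[i]
        i += step
    return None


def attach_to_verbs(rows):
    """Return a copy of `rows` with word+tag entries appended to the nearest verb row."""
    result = copy.deepcopy(rows)
    for i, (word, tags) in enumerate(rows):
        role = tags.split()[-1]  # assumption: the role tag is the last token
        if role.endswith('-right'):
            verb = nearest_verb(result, i, +1)
        elif role.endswith('-left'):
            verb = nearest_verb(result, i, -1)
        else:
            continue
        if verb is not None:
            verb.append(word + role)
    return result


words = [
    ['the', 'the +Det'],
    ['dog', 'dog +N +A-right'],
    ['ran', 'run +V +past'],
    ['at', 'at +P'],
    ['me', 'I +N +G-left'],
    ['and', 'and +Cnj'],
    ['the', 'the +Det'],
    ['ball', 'ball +N +G-right'],
    ['was', 'was +C'],
    ['kicked', 'kick +V +past'],
    ['by', 'by +P'],
    ['me', 'I +N +A-left'],
]

annotated = attach_to_verbs(words)
for row in annotated:
    print(row)
```

Because only the tag string at index 1 is inspected, entries such as 'dog+A-right' that have already been appended to a verb row are never mistaken for new tags. If you do want regex instead, note that re.match only tests the start of the string, so re.search is the call to use for a tag in the middle or at the end.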