We have an old application that needs to be migrated to a newer framework on AWS. The old application used an expression-based syntax to identify each client's permissions, expiry dates, etc.
I am trying to convert the old syntax into the newer one, which is comma-separated.
I am trying to use the pyparsing library to achieve this, but I feel like I am hitting a wall. So far, the code below gives me a list breakdown of the old code, but when the old code contains a nested block (an If inside an If), I am unable to parse it.
If (LAST_RUN_DATE>=dat('JUL 01, 90'))
If (pos(con('*',SUB_CODE,'*'),'*ABC*DEF*ASD*WQR*')>=1)
Calculate Client as 1
End If
End If
in_string = '''If (LAST_RUN_DATE>=dat('JUL 01, 90'))
If (pos(con('*',SUB_CODE,'*'),'*ABC*DEF*ASD*WQR*')>=1)
Calculate Client as 1
End If
End If'''
from pyparsing import *
#- define basic punctuation and data types
LBRACE,RBRACE,LPAREN,RPAREN,SEMI = map(Suppress,"{}();")
IF = Keyword("If")
END_IF = Keyword("End If")
after_if = Regex(r'(.*?)\n')
_if = Forward()
#- _if << Group(IF + Group(after_if))
_if << Group(Group(ZeroOrMore(IF)) + Group(ZeroOrMore(after_if)) + Group(ZeroOrMore(END_IF)))
#- parse the sample text
result = _if.parseString(in_string)
#- print out the tokens as a nice indented list using pprint
from pprint import pprint
pprint(result.asList())
### Output
[[['If'],
["(LAST_RUN_DATE>=dat('JUL 01, 90'))\n",
"If (pos(con('*',SUB_CODE,'*'),'*ABC*DEF*ASD*WQR*')>=1)\n",
'Calculate Client as 1\n',
'End If\n'],
['End If']]]
I used this as a reference and am expecting output similar to it: Link
You are definitely on the right track using Forward. Where you are going wrong is in trying to implement with ZeroOrMore what the Forward's recursion will do for you.
Replace:
_if << Group(Group(ZeroOrMore(IF)) + Group(ZeroOrMore(after_if)) + Group(ZeroOrMore(END_IF)))
with:
_if << Group(IF + after_if + Group(ZeroOrMore(_if | after_if)) + END_IF)
You'll also have to be careful not to read an END_IF as an after_if. This will do it:
after_if = ~END_IF + Regex(r'(.*?)\n')
With these changes, I get:
[['If',
"(LAST_RUN_DATE>=dat('JUL 01, 90'))\n",
[['If',
"(pos(con('*',SUB_CODE,'*'),'*ABC*DEF*ASD*WQR*')>=1)\n",
['Calculate Client as 1\n'],
'End If']],
'End If']]
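For reference, here is a minimal sketch that puts both changes together in one runnable script. It assumes pyparsing is installed and reuses the names from the question; swapping the star import for explicit imports is my own choice, not something the fix requires.

```python
from pprint import pprint

from pyparsing import Forward, Group, Keyword, Regex, ZeroOrMore

in_string = '''If (LAST_RUN_DATE>=dat('JUL 01, 90'))
If (pos(con('*',SUB_CODE,'*'),'*ABC*DEF*ASD*WQR*')>=1)
Calculate Client as 1
End If
End If'''

IF = Keyword("If")
END_IF = Keyword("End If")

# The rest of a line up to and including its newline,
# but only when that line does not start with "End If".
after_if = ~END_IF + Regex(r'(.*?)\n')

# Forward lets _if refer to itself, so an If body may contain further Ifs.
_if = Forward()
_if << Group(IF + after_if + Group(ZeroOrMore(_if | after_if)) + END_IF)

result = _if.parseString(in_string)
pprint(result.asList())
```

The body of each If is the inner Group, so every level of nesting in the old code shows up as one more level of list nesting in the result.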
You might also consider being a little more explicit about the if condition versus ordinary non-if statements (currently both are matched by after_if):
condition = originalTextFor(nestedExpr("(", ")"))
_if << Group(IF + condition + Group(ZeroOrMore(_if | after_if)) + END_IF)
This also helps in case your syntax allows the if condition to span newlines.
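As a rough sketch of the explicit-condition variant (again assuming pyparsing is installed, and reusing the question's sample text and names), the condition is matched as a balanced parenthesized expression and handed back as a single string, while after_if is left to handle only the statements inside the block:

```python
from pprint import pprint

from pyparsing import (Forward, Group, Keyword, Regex, ZeroOrMore,
                       nestedExpr, originalTextFor)

in_string = '''If (LAST_RUN_DATE>=dat('JUL 01, 90'))
If (pos(con('*',SUB_CODE,'*'),'*ABC*DEF*ASD*WQR*')>=1)
Calculate Client as 1
End If
End If'''

IF = Keyword("If")
END_IF = Keyword("End If")

# Balanced parentheses (quoted strings are skipped over by nestedExpr);
# originalTextFor returns the matched source text instead of nested tokens.
condition = originalTextFor(nestedExpr("(", ")"))

# Ordinary statements: any line that does not start with "End If".
after_if = ~END_IF + Regex(r'(.*?)\n')

_if = Forward()
_if << Group(IF + condition + Group(ZeroOrMore(_if | after_if)) + END_IF)

result = _if.parseString(in_string)
pprint(result.asList())
```

Because the condition is no longer matched by the line-based Regex, it comes back without a trailing newline, which may make it easier to rewrite into the new comma-separated form.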
Also, check out Yelp!'s undebt project for converting code (based on pyparsing).