I'm building an "XPath parsing tool" in Python for my team. In my case the input is not plain XPath: users enter it in a special structure. Here is an example.
The input format looks like this (each element is either a tuple of XPaths or a single XPath):
sig = "(xpath_1_1, xpath_1_2), (xpath_2_1, xpath_2_2), xpath_3..."
Users edit this string in Excel.
My goal is to parse the string into a list whose items are tuples or single elements:
[(xpath_1_1, xpath_1_2), (xpath_2_1, xpath_2_2), xpath_3...]
Then I can feed this list to Selenium and take the screenshots in order.
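For context, the consuming side could look roughly like the sketch below. It is only an illustration under assumptions: the function name `snapshot_all` and the file naming are hypothetical, and it assumes every XPath in a tuple is captured as its own image, which may not be what the tuples are meant for. `driver` is expected to be a Selenium WebDriver; the string "xpath" is the locator strategy that `By.XPATH` stands for.

```python
def snapshot_all(driver, targets, prefix="shot"):
    """Save one PNG per XPath, visiting the targets in list order."""
    saved = []
    for i, target in enumerate(targets, start=1):
        # Treat a single XPath like a one-element tuple.
        xpaths = target if isinstance(target, tuple) else (target,)
        for j, xpath in enumerate(xpaths, start=1):
            # Locate the element, then let it write its own screenshot.
            element = driver.find_element("xpath", xpath)
            path = f"{prefix}_{i}_{j}.png"  # hypothetical naming scheme
            element.screenshot(path)
            saved.append(path)
    return saved
```

Because the function only iterates over the list, the screenshots come out in the same order as the parsed input.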
Here is one of my test strings:
sig = "(//div[@style='font-family:Arial;float: left;width:930px;font-size:12px;' and ./span[contains(text(),'005930')]], //table[@id='gv_flow_krKS0 1']),//table[@id='123456'],(//div[@style='font-family:Arial;float: left;width:930px;font-size:12px;' and ./span[contains(text(),'000660')]], //table[@id='gv_flow_krKS0 2']),//table[@id='456789']"
Is there a good way to implement this without changing the order of the elements?
First, I don't think eval() is a good idea, since it may cause security problems.
Now I'm trying to solve it with the re library.
However, I find it quite difficult and don't know where to start.
Can anyone help? Thanks!
OK, I think this does what you want. You should try some different test strings.
sig = "(//div[@style='font-family:Arial;float: left;width:930px;font-size:12px;' and ./span[contains(text(),'005930')]], //table[@id='gv_flow_krKS0 1']),//table[@id='123456'],(//div[@style='font-family:Arial;float: left;width:930px;font-size:12px;' and ./span[contains(text(),'000660')]], //table[@id='gv_flow_krKS0 2']),//table[@id='456789']"
gather = ''
element = []
elements = []
state = ''         # closing delimiter we are waiting for, or '' if none
in_tuple = False   # must be initialised before the loop reads it
for c in sig:
    if state:
        # Inside [...] or '...': copy characters verbatim until the closer.
        gather += c
        if c == state:
            state = ''
        continue
    if c == '(':
        in_tuple = True
        continue
    elif c == ')':
        in_tuple = False
        element.append(gather.strip())
        gather = ''
        elements.append(tuple(element))
        element = []
        continue
    elif c == ',':
        if in_tuple:
            element.append(gather.strip())
        elif gather.strip():
            # Skip the empty string left over after a closing ')'.
            elements.append(gather.strip())
        gather = ''
        continue
    elif c == '[':
        state = ']'
    elif c == "'":
        state = "'"
    gather += c
# Handle leftover: a trailing single xpath has no comma after it.
if gather.strip():
    elements.append(gather.strip())
for e in elements:
    print(e)
Output:
("//div[@style='font-family:Arial;float: left;width:930px;font-size:12px;' and ./span[contains(text(),'005930')]]", "//table[@id='gv_flow_krKS0 1']")
//table[@id='123456']
("//div[@style='font-family:Arial;float: left;width:930px;font-size:12px;' and ./span[contains(text(),'000660')]]", "//table[@id='gv_flow_krKS0 2']")
//table[@id='456789']
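One caveat about the loop above: it stops protecting commas at the first `]` it meets, so a predicate such as `//b[c[d], e]` would be split at the comma that follows the inner bracket. A possible variant, sketched below, counts nesting depth instead. The name `parse_sig` and the rule that a `(` at the very start of an item opens a tuple are assumptions of this sketch, not part of the code above; a top-level XPath that itself begins with a parenthesis, such as `(//a)[1]`, would be misread as a tuple.

```python
def parse_sig(sig):
    """Split a signature string into a list of XPaths and tuples of XPaths."""
    result = []    # final list, in input order
    group = None   # items of the tuple being built, or None outside a tuple
    buf = []       # characters of the XPath being collected
    depth = 0      # nesting depth of [...] and (...) inside an XPath
    quote = ''     # active quote character, or '' outside a string literal

    def flush():
        # Store the collected XPath (if any) in the open tuple or the result.
        text = ''.join(buf).strip()
        buf.clear()
        if text:
            (result if group is None else group).append(text)

    for c in sig:
        if quote:
            # Inside a string literal: copy verbatim until the closing quote.
            buf.append(c)
            if c == quote:
                quote = ''
            continue
        at_top = depth == 0
        if c in '\'"':
            quote = c
        elif at_top and c == '(' and group is None and not ''.join(buf).strip():
            group = []   # opening parenthesis of a tuple
            continue
        elif at_top and c == ')' and group is not None:
            flush()      # closing parenthesis of a tuple
            result.append(tuple(group))
            group = None
            continue
        elif at_top and c == ',':
            flush()      # separator between items
            continue
        elif c in '[(':
            depth += 1
        elif c in '])':
            depth -= 1
        buf.append(c)
    flush()              # trailing item without a comma after it
    return result


example = "(//a[@x='1,2'], //b[c[d], e]),//t[@id='3'],(//p, //q),//z"
for item in parse_sig(example):
    print(item)
```

Items are appended as they are encountered, so the order of the input is preserved without any extra bookkeeping.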