Consider the following code, which reproduces my issue (a follow-up to my previous question, "How to parse groups with operator and brackets"):
from pyparsing import Group, Literal, Suppress, Word, nums
line = 'a(1)->b(2)->c(3)->b(4)->a(5)'
LPAR, RPAR = map(Suppress, "()")
num = Word(nums)
SEQOP = Suppress('->')
a = Group(Literal('a')+LPAR+num+RPAR)('ela*')
b = Group(Literal('b')+LPAR+num+RPAR)('elb*')
c = Group(Literal('c')+LPAR+num+RPAR)('elc*')
element = a | b | c
one_seq_expr = Group(element + (SEQOP + element)[...])('one_seq_expr')
out = one_seq_expr.parseString(line)
print(out.dump())
From this code I obtain the following results:
[[['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]]
- one_seq_expr: [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]
  - ela: [['a', '1'], ['a', '5']]
    [0]:
      ['a', '1']
    [1]:
      ['a', '5']
  - elb: [['b', '2'], ['b', '4']]
    [0]:
      ['b', '2']
    [1]:
      ['b', '4']
  - elc: [['c', '3']]
    [0]:
      ['c', '3']
We can access the results in different ways:
>> out[0]
([(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {}), (['a', '5'], {})], {'ela': [(['a', '1'], {}), (['a', '5'], {})], 'elb': [(['b', '2'], {}), (['b', '4'], {})], 'elc': [(['c', '3'], {})]})
>> out['one_seq_expr']
([(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {}), (['a', '5'], {})], {'ela': [(['a', '1'], {}), (['a', '5'], {})], 'elb': [(['b', '2'], {}), (['b', '4'], {})], 'elc': [(['c', '3'], {})]})
>> out['one_seq_expr'][0:4]
[(['a', '1'], {}), (['b', '2'], {}), (['c', '3'], {}), (['b', '4'], {})]
>> for _ in out[0]: print(_)
['a', '1']
['b', '2']
['c', '3']
['b', '4']
['a', '5']
>> out['one_seq_expr']['ela']
([(['a', '1'], {}), (['a', '5'], {})], {})
The ParseResults object out['one_seq_expr'] preserves the order in which the tokens were parsed. The named results, on the other hand, are grouped by name, so the order of appearance is preserved only within each name.
Is it possible to obtain an output structure that preserves the order across the different elements while still keeping the names in some form? Something like:
- one_seq_expr: [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]
  - ela_0: [['a', '1']]
    [0]:
      ['a', '1']
  - elb_0: [['b', '2']]
    [0]:
      ['b', '2']
  - elc_0: [['c', '3']]
    [0]:
      ['c', '3']
  - elb_1: [['b', '4']]
    [0]:
      ['b', '4']
  - ela_1: [['a', '5']]
    [0]:
      ['a', '5']
Or do we have to call ParseResults.getName() on each token in the ordered list out['one_seq_expr']? For example:
>> [_.getName() for _ in out['one_seq_expr']]
['ela', 'elb', 'elc', 'elb', 'ela']
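If the getName() route is taken, the indexed labels could be built in a small post-processing step. The following is only a sketch in plain Python: the two lists are copied from the output above rather than produced by a live parse, and the helper name index_names is hypothetical, not part of pyparsing.

```python
from collections import Counter

# Copied from the question: names as returned by getName(), and the
# corresponding tokens, both in parse order.
names = ['ela', 'elb', 'elc', 'elb', 'ela']
tokens = [['a', '1'], ['b', '2'], ['c', '3'], ['b', '4'], ['a', '5']]


def index_names(names):
    """Append a per-name running index, e.g. ela, elb, ela -> ela_0, elb_0, ela_1."""
    seen = Counter()
    indexed = []
    for name in names:
        indexed.append(f"{name}_{seen[name]}")
        seen[name] += 1
    return indexed


# Pair each indexed label with its tokens, keeping the parse order.
ordered = list(zip(index_names(names), tokens))
for label, tok in ordered:
    print(label, tok)
```

A list of pairs is used here instead of a dict so that the ordering is explicit in the data structure itself.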
You could use a parse action to annotate these elements with their respective types, and these would be retained with each element:
a.addParseAction(lambda t: t[0].insert(0, "ELA_TYPE"))
b.addParseAction(lambda t: t[0].insert(0, "ELB_TYPE"))
c.addParseAction(lambda t: t[0].insert(0, "ELC_TYPE"))
Parsing with these expressions and dumping the results gives (manually reformatted):
- one_seq_expr: [['ELA_TYPE', 'a', '1'],
['ELB_TYPE', 'b', '2'],
['ELC_TYPE', 'c', '3'],
['ELB_TYPE', 'b', '4'],
['ELA_TYPE', 'a', '5']]
... etc. ...
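Because each element now carries its own type tag, order and type can be read off in a single pass. A minimal sketch, assuming the annotated result has been converted to plain nested lists (the literal below is copied from the dump above, and the variable names are illustrative only):

```python
# Plain-list form of the annotated one_seq_expr result shown above.
tagged = [
    ['ELA_TYPE', 'a', '1'],
    ['ELB_TYPE', 'b', '2'],
    ['ELC_TYPE', 'c', '3'],
    ['ELB_TYPE', 'b', '4'],
    ['ELA_TYPE', 'a', '5'],
]

# Walk the elements in parse order; the tag is always the first item.
for position, (tag, name, value) in enumerate(tagged):
    print(position, tag, name, value)

# Select one type while keeping each match's position in the sequence.
elb_positions = [
    (position, value)
    for position, (tag, name, value) in enumerate(tagged)
    if tag == 'ELB_TYPE'
]
```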