pyparsing: group matched and unmatched strings in the same order as the input text

I have a problem parsing an expression string. I want to find all the identifiers in the input string using pyparsing.

identifier=pyparsing_common.identifier

I parse my input string like this:

identifier.parseString('1+2*xyz*abc/5')

I want the below as output

[['1+2*'],['xyz'],['*'],['abc'],['/5']]

Can anyone please help me achieve this?

Thanks in advance

Solution

Here are several code samples showing alternative ways to tackle your problem (using pyparsing version 2.4.7).

Using your definitions of input_string and identifier:

>>> import pyparsing as pp
>>> input_string = "1+2*xyz*abc/5"
>>> identifier = pp.pyparsing_common.identifier

Using identifier.split() (similar to re.split) to get the parts of the input string:

>>> print(list(identifier.split(input_string, includeSeparators=True)))
['1+2*', 'xyz', '*', 'abc', '/5']
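
If you need the exact nested shape from the question, one option is to wrap each piece from split() in its own list. This is only a minimal sketch using the 2.4.7 names shown above; the variable names parts and nested are mine, and the "if part" filter is an assumption that you would rather drop the empty strings split() yields when an identifier sits at the very start or end of the input.

```python
import pyparsing as pp

input_string = "1+2*xyz*abc/5"
identifier = pp.pyparsing_common.identifier

# split() is a generator; includeSeparators=True keeps the matched identifiers
parts = identifier.split(input_string, includeSeparators=True)

# Wrap each non-empty piece in a one-element list, preserving input order
nested = [[part] for part in parts if part]
print(nested)
# [['1+2*'], ['xyz'], ['*'], ['abc'], ['/5']]
```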

Using identifier.searchString() to return a ParseResults for each match:

>>> print(identifier.searchString(input_string))
[['xyz'], ['abc']]

Using the sum() built-in to combine the matches into a single ParseResults:

>>> print(sum(identifier.searchString(input_string)))
['xyz', 'abc']

Using the locatedExpr helper function to wrap identifier, so that each match produces a group containing the start location, the matched value, and the end location:

>>> print(sum(pp.locatedExpr(identifier).searchString(input_string)))
[[4, 'xyz', 7], [8, 'abc', 11]]

Using dump() to show the values as a list, and the named results in each subgroup:

>>> print(sum(pp.locatedExpr(identifier).searchString(input_string)).dump())
[[4, 'xyz', 7], [8, 'abc', 11]]
[0]:
  [4, 'xyz', 7]
  - locn_end: 7
  - locn_start: 4
  - value: 'xyz'
[1]:
  [8, 'abc', 11]
  - locn_end: 11
  - locn_start: 8
  - value: 'abc'
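
The start and end locations are also enough to rebuild the unmatched text between identifiers yourself, which may help if you need to know which pieces were matches. The following is a rough sketch built on scanString(), which yields (tokens, start, end) for each match; the helper name split_with_flags and the (piece, is_match) tuple layout are hypothetical choices for illustration, not part of pyparsing.

```python
import pyparsing as pp


def split_with_flags(expr, text):
    """Yield (piece, is_match) pairs that cover text in input order.

    Hypothetical helper: not part of pyparsing.
    """
    last = 0
    for _tokens, start, end in expr.scanString(text):
        if start > last:
            # Unmatched text between the previous match and this one
            yield text[last:start], False
        yield text[start:end], True
        last = end
    if last < len(text):
        # Trailing unmatched text after the final match
        yield text[last:], False


identifier = pp.pyparsing_common.identifier
pieces = list(split_with_flags(identifier, "1+2*xyz*abc/5"))
print(pieces)
# [('1+2*', False), ('xyz', True), ('*', False), ('abc', True), ('/5', False)]
```

Unlike split(), this version never emits empty strings, at the cost of writing the loop by hand.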