
How to split portions of text into a tuple/list in Python based on a word


I have the following sample text and need to split it into a tuple/list, with each entry starting at a line that contains the phrase "ALL Banks Report". The raw text is as follows:

%Bank PARSED MESSAGE FILE
%VERSION   : PIL 98.7
%nex MODULE   : SIL 98

2018 Jan 31  16:44:53.050 ALL Banks Report SBI
name id ID = 0,  ID = 58
    Freq = 180

    conserved NEXT:
      message c1 : ABC1 : 
          {
            XYZ2
           }
2018 Jan 31  16:44:43.050 ALL Banks Report HDFC
conserved LATE:

World ::= 
{
  Asia c1 : EastAsia : 
      {
        India
       }
}

...and so on, with many repetitions. I want to build a tuple/list/array split on the phrase "ALL Banks Report", so that list[0] contains the following:

2018 Jan 31  16:44:53.050 ALL Banks Report SBI
name id ID = 0,  ID = 58
    Freq = 180

    conserved NEXT:
      message c1 : ABC1 : 
          {
            XYZ2
           }

and list[1] contains the next block, as below:

2018 Jan 31  16:44:43.050 ALL Banks Report HDFC
conserved LATE:

World ::= 
{
  Asia c1 : EastAsia : 
      {
        India
       }
}

Solution

  • IMO, there's no particular advantage to using pyparsing here. It's easy to process this file with an old-fashioned line-by-line loop that starts a new group at every line containing 'ALL Banks Report'.

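    output_list = []   # one entry per report
    items = []         # lines of the report currently being collected
    with open('spark.txt') as spark:
        for line in spark:
            line = line.rstrip()
            # skip blank lines and the '%' header lines
            if line and not line.startswith('%'):
                if 'ALL Banks Report' in line:
                    # a new report begins: store the previous one as a single entry
                    if items:
                        output_list.append('\n'.join(items))
                    items = [line]
                else:
                    items.append(line)
    # store the final report
    if items:
        output_list.append('\n'.join(items))

    for item in output_list:
        print(item)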
    

    Output:

    2018 Jan 31  16:44:53.050 ALL Banks Report SBI
    name id ID = 0,  ID = 58
        Freq = 180
        conserved NEXT:
          message c1 : ABC1 :
              {
                XYZ2
               }
    2018 Jan 31  16:44:43.050 ALL Banks Report HDFC
    conserved LATE:
    World ::=
    {
      Asia c1 : EastAsia :
          {
            India
           }
    }
    

    Incidentally, I have avoided the use of list as an identifier. It isn't a keyword, but it is the name of a Python built-in type, and using it as a variable name would shadow the built-in.
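
If the blank lines inside each block should be kept, as in the list[0] and list[1] shown in the question, a variant along these lines might work. Treat it as a sketch only: the function name split_reports and the shortened sample string are made up for illustration, and it takes the text as a string instead of reading spark.txt.

```python
def split_reports(text, marker="ALL Banks Report"):
    """Return one string per report; each starts at a line containing marker.

    split_reports is a hypothetical helper, not part of any library.
    """
    blocks = []
    for line in text.splitlines():
        line = line.rstrip()
        # drop the '%' header lines
        if line.startswith("%"):
            continue
        if marker in line:
            # a marker line opens a new block
            blocks.append([line])
        elif blocks:
            # any other line (blank ones included) belongs to the current block;
            # lines seen before the first marker are ignored
            blocks[-1].append(line)
    # join each block and trim blank lines left at its end
    return ["\n".join(block).rstrip("\n") for block in blocks]


# shortened, made-up sample in the same shape as the question's text
sample = """%Bank PARSED MESSAGE FILE
%VERSION   : PIL 98.7

2018 Jan 31  16:44:53.050 ALL Banks Report SBI
name id ID = 0,  ID = 58
    Freq = 180

    conserved NEXT:
2018 Jan 31  16:44:43.050 ALL Banks Report HDFC
conserved LATE:

World ::=
"""

reports = split_reports(sample)
print(len(reports))  # 2
print(reports[1])
```

Each entry is a single multi-line string. If a list of lines per report is more convenient, returning blocks directly instead of joining them should do.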