python text-parsing

Parse text for matching key then grab first set of matching table name rows


I'm trying to do some reconciliations with some large, old flat text files (that are honestly messes). The issue I am having is that once I find my matching key, I want to grab only the first set of consecutive rows with a matching table name and ignore the rest. How would I read what I need and not the rest? I've been playing around with breaks, but the logic is escaping me.
Example: if I were looking for a PK of 101 and a table name of drink, from the list below I want to print:

drink 25
drink 26

FlatTextFile.txt
pk_tbl 23 100
food 0 0
drink 0 0
dessert 0 0

pk_tbl 101
food 0
drink 25
drink 26
dessert 0
drink 27
drink 28
drink 29

pk_tbl 102
food 0
drink 0
drink 0
drink 0
dessert 0

Pseudocode for where I am at, essentially:

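        # d: the rows of the file, each already split into fields
        pk_flag = False
        for row in d:
            if row[0] == 'drink' and pk_flag:
                print(row)
            # the key is the second field in the sample above
            if row[0] == 'pk_tbl' and row[1] == '101':
                pk_flag = True
            elif row[0] == 'pk_tbl' and row[1] != '101':
                pk_flag = False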

A little confusing haha, any help is appreciated. Thanks!


Solution

    def get_table_data(file_path='FlatTextFile.txt', table_keyword='pk_tbl', table_num='101', data_keyword='drink'):
        """Return the first consecutive run of data_keyword lines in the table_num block.

        Lines are returned as read, including the trailing newline.
        """
        output_ls = []
        with open(file_path, 'r') as fh:
            table = False  # True once the wanted table header has been seen
            data = False   # True once the first data_keyword row has been seen
            for line in fh:
                row = line.split()
                if not row:  # Ignoring blank lines
                    continue
                if not table:  # Searching for table keyword and number
                    if row[:2] == [table_keyword, table_num]:
                        table = True
                elif row[0] == table_keyword:  # I'm already at next table
                    break
                elif row[0] == data_keyword:  # First or consecutive data row
                    data = True
                    output_ls.append(line)
                elif data:  # The consecutive run has ended
                    break
        return output_ls
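
If you would rather avoid the two flags, the same idea can probably be expressed with `itertools.dropwhile` and `itertools.takewhile`. This is only a sketch, not a replacement for `get_table_data` above: the name `first_run`, its parameters, and the in-memory `SAMPLE` string are made up for illustration, and it assumes the key is always the second field of the `pk_tbl` row, as in the sample file.

```python
from itertools import dropwhile, takewhile

# Hypothetical in-memory copy of the sample file from the question.
SAMPLE = """\
pk_tbl 23 100
food 0 0
drink 0 0
dessert 0 0

pk_tbl 101
food 0
drink 25
drink 26
dessert 0
drink 27
drink 28
drink 29

pk_tbl 102
food 0
drink 0
drink 0
drink 0
dessert 0
"""


def first_run(lines, pk="101", table="drink", pk_keyword="pk_tbl"):
    """Return the first consecutive run of `table` rows under the `pk` block."""
    # Split each non-blank line into whitespace-separated fields.
    rows = (line.split() for line in lines if line.strip())
    # Skip everything before the header row of the wanted key.
    rows = dropwhile(lambda r: r[:2] != [pk_keyword, pk], rows)
    next(rows, None)  # discard the header row itself (no-op if key not found)
    # Stay inside this block: stop at the next header row.
    block = takewhile(lambda r: r[0] != pk_keyword, rows)
    # Skip rows before the first match, then keep only the consecutive matches.
    run = takewhile(lambda r: r[0] == table,
                    dropwhile(lambda r: r[0] != table, block))
    return [" ".join(r) for r in run]


if __name__ == "__main__":
    print(first_run(SAMPLE.splitlines()))  # ['drink 25', 'drink 26']
```

Because everything is lazy, reading stops as soon as the run ends, so passing an open file handle instead of `SAMPLE.splitlines()` should not pull the rest of a large file into memory. An unknown key simply gives back an empty list.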