python text-processing

How can I loop through blocks of lines in a file?


I have a text file that looks like this, with blocks of lines separated by blank lines:

ID: 1
Name: X
FamilyN: Y
Age: 20

ID: 2
Name: H
FamilyN: F
Age: 23

ID: 3
Name: S
FamilyN: Y
Age: 13

ID: 4
Name: M
FamilyN: Z
Age: 25

How can I loop through the blocks and process the data in each block? Eventually I want to gather the family name, name, and age values into three columns, like so:

Y X 20
F H 23
Y S 13
Z M 25

Solution

  • Here's another way, using itertools.groupby. groupby iterates through the lines of the file and calls isa_group_separator(line) on each one. isa_group_separator returns True for a separator line and False for any other line (this return value is called the key), and itertools.groupby then groups together all consecutive lines that produced the same key.

    This is a very convenient way to collect lines into groups.

    import itertools

    def isa_group_separator(line):
        # A blank (or whitespace-only) line separates two blocks.
        return not line.strip()

    with open('data_file') as f:
        for key, group in itertools.groupby(f, isa_group_separator):
            # Uncomment the next line to see what itertools.groupby yields.
            # Note that list(group) exhausts the `group` iterator, so the
            # rest of the loop body will not work while it is enabled.
            # print(key, list(group))
            if not key:
                data = {}
                for item in group:
                    # Split on the first colon only, so values may contain ':'.
                    field, value = item.split(':', 1)
                    data[field.strip()] = value.strip()
                print('{FamilyN} {Name} {Age}'.format(**data))
    
    # Y X 20
    # F H 23
    # Y S 13
    # Z M 25
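
If you would rather collect the records than print them straight away, the same groupby idea can be wrapped in a generator. The following is only a sketch under a few assumptions: the names SAMPLE, is_blank, and parse_blocks are made up for illustration, and the data is read from an in-memory string (via io.StringIO) instead of a real file so that the snippet runs on its own.

```python
import io
import itertools

# Hypothetical sample that mirrors the layout of the file in the question.
SAMPLE = """\
ID: 1
Name: X
FamilyN: Y
Age: 20

ID: 2
Name: H
FamilyN: F
Age: 23
"""

def is_blank(line):
    # Empty and whitespace-only lines count as block separators.
    return not line.strip()

def parse_blocks(lines):
    """Yield one dict per run of consecutive non-blank lines."""
    for blank, group in itertools.groupby(lines, key=is_blank):
        if blank:
            continue  # skip the separator runs
        record = {}
        for line in group:
            # partition splits on the first ':' only.
            field, _, value = line.partition(':')
            record[field.strip()] = value.strip()
        yield record

records = list(parse_blocks(io.StringIO(SAMPLE)))
rows = [(r['FamilyN'], r['Name'], r['Age']) for r in records]
for row in rows:
    print(' '.join(row))
```

Because parse_blocks accepts any iterable of lines, passing it an open file object instead of the StringIO should work the same way. Keeping the records as dicts also leaves the other fields, such as ID, available for later processing.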