
Python: read a file, grabbing lines with conditions


Say I have a file my_file from which I want to extract certain lines, with each extracted piece becoming one list element. I am trying to understand how to control and use Python's file I/O operations.

The file:

cat > my_file <<EOF
[Ignore_these] 
abc
234
[Wow]
123
321
def
[Take_rest]
ghi
jkl
EOF

Say that after the line [Wow] I want to merge the integer lines (there could be any number of them; here I get '123321') and ignore everything else until I reach [Take_rest], after which I want all remaining lines ('ghi' and 'jkl'). [Take_rest] is always the last section. So the resulting output is data = ['123321', 'ghi', 'jkl']. I tried something like the following, but I fail to understand how readline() and next() (etc.) work.

def is_int(s):
    try:
        int(s)
        return True
    except ValueError:
        return False


with open('my_file', 'r') as f:
    data = []
    while True:
        line = f.readline()
        if '[Wow]' in line:
            wow = ''
            while is_int(next(f)):
                wow = ''.join([wow, line])
            data.append(wow)
        if '[Take_rest]' in line:
            data.append(next(f))

        if not line:
            break

Solution

  • Instead of juggling readline() and next(), iterate over the file once and track the current section with a flag:

    with open('my_file') as f:
        data = ['']          # data[0] collects the merged digits from [Wow]
        wow_flag = False
        for line in f:
            line = line.strip()
            if line.startswith('[Wow]'):   # detect `Wow` section start
                wow_flag = True
            elif line.startswith('[Take_rest]'):  # taking all the rest
                # consumes the remaining lines, so the outer loop ends here
                data.extend([rest.strip() for rest in f])
            elif wow_flag and line.isdigit():   # capturing digits under `Wow` section
                data[-1] += line

    print(data)
    

    The output:

    ['123321', 'ghi', 'jkl']
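
As for why the attempt in the question misbehaves: every call to next(f) (like every f.readline()) consumes one line from the same file object. The inner while loop tests the line returned by next(f) and then throws it away, while ''.join([wow, line]) keeps appending the stale line that still holds '[Wow]'. The sketch below shows this consumption on a small in-memory file, then wraps the same flag-based idea in a function. The name parse_sections and the use of io.StringIO in place of my_file are assumptions made for illustration, and unlike the answer above, this variant switches the flag off again when any other section header appears.

```python
import io

# In-memory stand-in for my_file, so the example runs without touching disk.
SAMPLE = """\
[Ignore_these]
abc
234
[Wow]
123
321
def
[Take_rest]
ghi
jkl
"""

# readline() and next() advance the same position: each call eats one line.
demo = io.StringIO("a\nb\nc\n")
first = demo.readline()   # consumes 'a\n'
second = next(demo)       # consumes 'b\n', not 'a\n' again


def parse_sections(lines):
    """Merge digit lines under [Wow], then collect every line after [Take_rest].

    `lines` is any iterable of strings, e.g. an open text file.
    (Hypothetical helper, not part of the original answer.)
    """
    it = iter(lines)
    wow = ''
    in_wow = False
    rest = []
    for raw in it:
        line = raw.strip()
        if line.startswith('['):
            if line == '[Take_rest]':
                # The comprehension drains the same iterator the for loop
                # uses, so nothing is read twice.
                rest = [r.strip() for r in it]
                break
            # Any other header switches the [Wow] flag on or off.
            in_wow = (line == '[Wow]')
        elif in_wow and line.isdigit():
            wow += line
    return [wow] + rest


data = parse_sections(io.StringIO(SAMPLE))
print(data)
```

With a real file, the call would be along the lines of `with open('my_file') as f: data = parse_sections(f)`, since a text file object is itself an iterator over its lines.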