
Python: Parsing multi-line records in a for loop with str.rstrip() using generators


I need to strip newline characters from a huge file f that I am parsing two lines at a time, like this:

def foo(f):
  with open(f, "r") as f:
    for n, s in zip(f, f):
      # Do something with `n` and `s`
      pass

Is it possible to call str.rstrip directly in the for loop line, or do I need to do it separately within the loop body? This does not work: for n, s in zip(f.rstrip(), f.rstrip()):

(Question updated to make it more concise)

UPDATE:

These solutions come from Barmar's answer and PaulCornelius's comments below:

def foo2(f):
  with open(f, "r") as f:
    for n, s in zip((line.rstrip() for line in f), (line.rstrip() for line in f)):
      # Do something with `n` and `s`
      pass

and

def foo3(f):
  with open(f, "r") as f:
    g = (line.rstrip() for line in f)
    for n, s in zip(g, g):
      # Do something with `n` and `s`
      pass
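
The generator in foo3 can also be written with map(), which is lazy in Python 3. This is only a minimal sketch: the helper name pairs is hypothetical, and an in-memory io.StringIO stands in for the opened file. Note that str.rstrip() with no argument removes all trailing whitespace, not only the newline.

```python
import io

def pairs(f):
    """Return an iterator of (first, second) line pairs with trailing whitespace stripped."""
    # map() is lazy in Python 3, so lines are read only as zip() asks for them.
    lines = map(str.rstrip, f)
    # Passing the same iterator twice makes zip() take two consecutive lines per tuple.
    return zip(lines, lines)

# io.StringIO is a stand-in for a real file opened with open(path, "r").
sample = io.StringIO("name1\nseq1\nname2\nseq2\n")
result = list(pairs(sample))
print(result)  # [('name1', 'seq1'), ('name2', 'seq2')]
```

If only the newline should go, str.rstrip can be replaced by a small lambda such as lambda line: line.rstrip("\n").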

UPDATE 2:

To parse several files at once, make one generator per file (here two files):

def foo4(f1, f2):
  with open(f1, "r") as f1, open(f2, "r") as f2:
    g1, g2 = (line.rstrip() for line in f1), (line.rstrip() for line in f2)
    for n1, s1, n2, s2 in zip(g1, g1, g2, g2):
      # Do something with `n1`, `s1`, `n2` and `s2`
      pass
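
One thing to keep in mind with the two-file version is that zip() stops as soon as any of its inputs runs out, so the shorter file decides how many records are produced. The sketch below illustrates this with in-memory stand-ins for the two files; the name paired_records and the sample contents are assumptions made for the example.

```python
import io

def paired_records(f1, f2):
    """Return an iterator of (n1, s1, n2, s2) tuples from two open files."""
    g1 = (line.rstrip() for line in f1)
    g2 = (line.rstrip() for line in f2)
    # Each generator appears twice, so every tuple holds two lines from each file.
    return zip(g1, g1, g2, g2)

# io.StringIO objects stand in for files opened with open(path, "r").
file_a = io.StringIO("a1\na2\na3\na4\n")  # two complete records
file_b = io.StringIO("b1\nb2\n")          # one complete record
records = list(paired_records(file_a, file_b))
print(records)  # [('a1', 'a2', 'b1', 'b2')]
```

The second record of the longer file is read and then dropped without any error, so files of different lengths need an explicit check if that matters.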

Solution

  • You can use list comprehensions.

    for n1, n2 in zip([line.rstrip() for line in f1], [line.rstrip() for line in f1]):
    

    But this doesn't process the file two lines at a time. Each list comprehension reads the whole file at once; since both iterate over the same file object, the first one consumes every line and the second is left empty, so the loop body never runs.

    I think you can get what you want using a generator expression instead, since they're lazy:

    for n1, n2 in zip((line.rstrip() for line in f1), (line.rstrip() for line in f1)):
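
The difference between the two forms can be checked on a small in-memory file. This is a sketch under the assumption that io.StringIO behaves like the opened file for line iteration; the sample text and variable names are made up for the demonstration.

```python
import io

text = "n1\ns1\nn2\ns2\n"

# List comprehensions are eager: the first one reads every line,
# leaving nothing for the second, so zip() has no pairs to yield.
f1 = io.StringIO(text)
with_lists = list(zip([line.rstrip() for line in f1],
                      [line.rstrip() for line in f1]))

# Generator expressions are lazy: zip() pulls one line from each in turn,
# and both read from the same underlying file iterator.
f1 = io.StringIO(text)
with_generators = list(zip((line.rstrip() for line in f1),
                           (line.rstrip() for line in f1)))

print(with_lists)       # []
print(with_generators)  # [('n1', 's1'), ('n2', 's2')]
```

With the generator version only two lines are held in memory at a time, which is what matters for a huge file.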