Consider the following Python snippet:

    af=open("a",'r')
    bf=open("b", 'w')
    for i, line in enumerate(af):
        if i < K:
            bf.write(line)

Now, suppose I want to handle the case where K is None, so the writing continues to the end of the file.
I'm currently doing

    if K is None:
        for i, line in enumerate(af):
            bf.write(line)
    else:
        for i, line in enumerate(af):
            bf.write(line)
            if i==K:
                break

This clearly isn't the best way to handle this, as I'm duplicating the code.
Is there some more integrated way I can handle this? The natural thing would be to have the if/break code only be present if K is not None, but this involves writing syntax on the fly a la Lisp macros, which Python can't really do. Just to be clear, I'm not concerned about the particular case (which I chose partly for its simplicity), so much as learning about general techniques I may not be familiar with.
UPDATE: After reading answers people have posted, and doing more experimentation, here are some more comments.
As said above, I was looking for general techniques, and I think @Paul's answer, namely using takewhile from itertools, fits that best. As a bonus, it is also much faster than the naive method I listed above; I'm not sure why. I'm not really familiar with itertools, though I've looked at it a few times. From my perspective this is a case of functional programming For The Win! (Amusingly, the author of itertools once asked for feedback about dropping takewhile. See the thread beginning http://mail.python.org/pipermail/python-list/2007-December/522529.html.)
I'd simplified my situation above; the actual situation is a bit messier, because I'm writing to two different files in the loop. So the code looks more like:

    for i, line in enumerate(af):
        if i < K:
            bf.write(line)
            cf.write(line.split(',')[0].strip('"')+'\n')

Given my posted example, @Jeff reasonably suggested that in the case when K was None, I just copy the file. Since in practice I am looping anyway, doing so is not such a clear choice. However, takewhile generalizes painlessly to this case. I also had another use case I did not mention here, and was able to use takewhile there too, which was nice. The second example looks like (verbatim)

    i=0
    for line in takewhile(illuminacond, af):
        line_split=line.split(',')
        pid=line_split[1][0:3]
        out = line_split[1] + ',' + line_split[2] + ',' + line_split[3][1] + line_split[3][3] + ',' \
              + line_split[15] + ',' + line_split[9] + ',' + line_split[10]
        if pid!='cnv' and pid!='hCV' and pid!='cnv':
            i = i+1
            of.write(out.strip('"')+'\n')
            tf.write(line)

Here I was able to use the condition

    if K is None:
        illuminacond = lambda x: x.split(',')[0] != '[Controls]'
    else:
        illuminacond = lambda x: x.split(',')[0] != '[Controls]' and i < K

per @Paul's original example. However, I'm not completely happy about the fact that I'm getting i from the outer scope, though the code works. Is there a better way of doing this? Or maybe it should be a separate question. Anyway, thanks to everyone who answered my question. Honorable mention to @Jeff, who made some nice suggestions.
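One possible way to avoid reading i from the enclosing scope is to move the counting into a generator function, so that the counter sits next to the stopping condition. The following is only a sketch under assumptions: accepted_lines is a hypothetical name, the sample rows are made up, and the output formatting of the real loop is left out.

```python
import io


def accepted_lines(lines, K=None):
    """Yield accepted lines until a '[Controls]' row or K accepted lines.

    Hypothetical helper: the counter is local to the generator, so no
    lambda has to look up a variable in an outer scope.
    """
    count = 0
    for line in lines:
        # Stop at the '[Controls]' marker row, as in the lambda condition.
        if line.split(',')[0] == '[Controls]':
            break
        # Stop once K lines have been accepted; K=None means no limit.
        if K is not None and count >= K:
            break
        pid = line.split(',')[1][0:3]
        if pid != 'cnv' and pid != 'hCV':
            count += 1
            yield line


# Made-up sample rows standing in for the real input file.
sample = (
    "a,rs1,x\n"
    "b,cnv2,x\n"
    "c,rs3,x\n"
    "[Controls],,\n"
    "d,rs4,x\n"
)

all_rows = list(accepted_lines(io.StringIO(sample)))
first_only = list(accepted_lines(io.StringIO(sample), K=1))
```

The calling loop then becomes `for line in accepted_lines(af, K):` and only has to build and write the output for each line it receives.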
itertools.takewhile will apply your condition, and then break out of the loop the first time the condition fails.

    from itertools import takewhile

    if K is None:
        condition = lambda x: True
    else:
        condition = lambda x: x[0] < K

    for i,line in takewhile(condition, enumerate(af)):
        bf.write(line)

If K is None, you don't want takewhile to ever stop, so the condition function should always return True. But if you are given a numeric value for K, takewhile stops as soon as element 0 of the tuple passed to the condition (the index supplied by enumerate) is >= K.
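To try this behaviour without touching real files, the same loop can be run over in-memory streams. The sketch below is an illustration rather than part of the approach above: copy_head and copy_head_islice are hypothetical helper names, and io.StringIO stands in for the files a and b. The second helper uses itertools.islice as an alternative, since a stop value of None makes islice run to the end of the iterable, so the None case needs no separate branch.

```python
import io
from itertools import islice, takewhile


def copy_head(src, dst, K=None):
    """Copy the first K lines of src to dst, or every line if K is None."""
    if K is None:
        condition = lambda pair: True
    else:
        # pair is an (index, line) tuple produced by enumerate.
        condition = lambda pair: pair[0] < K
    for i, line in takewhile(condition, enumerate(src)):
        dst.write(line)


def copy_head_islice(src, dst, K=None):
    """Alternative: islice yields at most K items; K=None means no limit."""
    for line in islice(src, K):
        dst.write(line)


text = "one\ntwo\nthree\n"

limited = io.StringIO()
copy_head(io.StringIO(text), limited, K=2)

unlimited = io.StringIO()
copy_head(io.StringIO(text), unlimited)

sliced = io.StringIO()
copy_head_islice(io.StringIO(text), sliced, K=2)
```

One difference worth keeping in mind: takewhile has to read the first element that fails the condition in order to test it, and that element is then discarded, whereas islice reads only the items it yields. This matters only if the same file object is read again afterwards.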