I have a sample file that looks like this:
@XXXXXXXXX
VXVXVXVXVX
+
ZZZZZZZZZZZ
@AAAAAA
YBYBYBYBYBYBYB
ZZZZZZZZZZZZ
...
I want to read only the lines that fall on index 4i+2, where i starts at 0 and lines are counted from 1. So I should read the VXVXV... line (4*0+2 = 2) and the YBYB... line (4*1+2 = 6) in the snippet above. I need to count the number of 'V's, 'X's, 'Y's and 'B's and store the counts in a pre-existing dict.
fp = open(fileName, "r")
lines = fp.readlines()
# Index 1 is the second line of the file; a step of 4 lands on the same line of each record.
for i in xrange(1, len(lines), 4):
    for c in str(lines[i]):
        if c == 'V':
            some_dict['V'] += 1
Can someone explain how to avoid running past the end of the list and read only the lines at the 4*i+2 positions of the lines list?
Can't you just slice the list of lines?
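lines = fp.readlines()
# Lists are zero-based, so line 2 of the file (4*0 + 2) is index 1.
interesting_lines = lines[1::4]
Edit for others questioning how it works:
The "full" slice syntax has three parts: start:end:step
The start is the starting index, 0 by default. List indices are zero-based, so the line at position 4 * i + 2 (counting from 1) has index 4 * i + 1; when i == 0, that is index 1.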
The end is the ending index, len(sequence) by default. Slices go up to but not including the end index.
The step is the increment between chosen items, 1 by default. Normally, a slice like 3:7 would return elements 3, 4, 5, 6 (and not 7). But when you add a step parameter, you can do things like "step by 4".
Doing "step by 4" means start+0, start+4, start+8, start+12, ... which is what the OP wants, so long as the start parameter is chosen correctly.
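As a rough sketch of how the stepped slice could feed the counting the OP describes: the example below assumes the file's lines are counted from 1 as in the question, so line 2 sits at zero-based index 1 and the slice starts there. The names `count_chars` and `sample_lines` are made up for illustration, and the second record is assumed to contain a "+" line like the first.

```python
# Illustrative sketch only; count_chars and sample_lines are hypothetical names.

def count_chars(lines, counts, wanted="VXYB"):
    """Tally the wanted characters found on lines[1::4] into counts."""
    for line in lines[1::4]:           # zero-based indices 1, 5, 9, ...
        for c in line.strip():         # strip() drops the trailing newline
            if c in wanted:
                counts[c] = counts.get(c, 0) + 1
    return counts

# In-memory stand-in for fp.readlines(); assumes four lines per record.
sample_lines = [
    "@XXXXXXXXX\n",
    "VXVXVXVXVX\n",
    "+\n",
    "ZZZZZZZZZZZ\n",
    "@AAAAAA\n",
    "YBYBYBYBYBYBYB\n",
    "+\n",
    "ZZZZZZZZZZZZ\n",
]

some_dict = {"V": 0, "X": 0, "Y": 0, "B": 0}   # stands in for the pre-existing dict
count_chars(sample_lines, some_dict)
```

Because a slice simply stops when it runs out of elements, there is no index to overshoot: a file whose line count is not a multiple of 4 still works without any bounds check.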