
Read specific sequence of lines in Python


I have a sample file that looks like this:

    @XXXXXXXXX
    VXVXVXVXVX
    +
    ZZZZZZZZZZZ
    @AAAAAA
    YBYBYBYBYBYBYB
    ZZZZZZZZZZZZ
    ...

I want to read only the lines at positions 4i+2, counting lines from 1, where i starts at 0. So I should read the VXVXV... line (4*0+2 = 2) and the YBYB... line (4*1+2 = 6) in the snippet above. I need to count the number of 'V's, 'X's, 'Y's and 'B's and store the counts in a pre-existing dict.

    fp = open(fileName, "r")
    lines = fp.readlines()

    # Zero-based index 1 is line 2 of the file; step through every 4th line.
    for i in range(1, len(lines), 4):
        for c in lines[i]:
            if c == 'V':
                some_dict['V'] += 1

Can someone explain how to avoid running past the end of the lines list and read only the lines at the 4*i+2 positions?


Solution

  • Can't you just slice the list of lines?

    lines = fp.readlines()
    interesting_lines = lines[1::4]
    

    Edit for others questioning how it works:

    The "full" slice syntax is three parts: start:end:step

    The start is the starting index, or 0 by default. List indices count from 0, while the question's 4 * i + 2 counts lines from 1, so the first wanted line (line 2, when i == 0) sits at index #1, and that is the start.

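    The end is the ending index, or len(sequence) by default. Slices go up to but not including the end index. A slice never raises an IndexError: bounds past the end of the list are clamped, so there is no risk of going off the end.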

    The step is the increment between chosen items, 1 by default. Normally, a slice like 3:7 would return elements 3,4,5,6 (and not 7). But when you add a step parameter, you can do things like "step by 4".

    Doing "step by 4" means start+0, start+4, start+8, start+12, ... which is what the OP wants, so long as the start parameter is chosen correctly.
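
To tie this back to the question, here is a minimal sketch of the slice combined with the counting step. It assumes the 4*i+2 positions count lines from 1, so the first wanted line sits at zero-based index 1. The in-memory `lines` list stands in for `fp.readlines()`, the initial contents of `some_dict` are made up for illustration, and the sample is assumed to have four lines per record (a `+` line was added to the second record). Using `collections.Counter` is just one convenient way to count; a plain loop works as well.

```python
from collections import Counter

# Hypothetical stand-in for fp.readlines(): four lines per record.
lines = [
    "@XXXXXXXXX\n",
    "VXVXVXVXVX\n",
    "+\n",
    "ZZZZZZZZZZZ\n",
    "@AAAAAA\n",
    "YBYBYBYBYBYBYB\n",
    "+\n",
    "ZZZZZZZZZZZZ\n",
]

# Pre-existing dict from the question (initial contents are an assumption).
some_dict = {"V": 0, "X": 0, "Y": 0, "B": 0}

# Start at index 1 (the file's 2nd line) and take every 4th line after it.
interesting_lines = lines[1::4]

for line in interesting_lines:
    counts = Counter(line.strip())      # per-line character counts
    for letter in some_dict:            # only update the letters we track
        some_dict[letter] += counts[letter]

print(interesting_lines)  # ['VXVXVXVXVX\n', 'YBYBYBYBYBYBYB\n']
print(some_dict)          # {'V': 5, 'X': 5, 'Y': 7, 'B': 7}

# A start index past the end yields an empty list instead of an error.
print(lines[100::4])      # []
```

Note that `readlines()` loads the whole file into memory; for very large files, iterating over the file object and keeping every 4th line would be an alternative worth considering.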