
How to read a file starting after one marker line and stopping before another


I'm very new to Python, and I've run into a fairly trivial issue.

I have a file called myfile.txt that looks like this:

some lines
that
I don't really need
and that 
can be skipped
START FROM THE NEXT ONE
  555555555555555
  555999999999999
  555333333333333
  555111111111111
  555333333333333
****
I don't need 
any 
of these 

I want to print only the lines after 'START FROM THE NEXT ONE', up to but not including the '****' line. The file contents may change over time, so I can't rely on line numbers.

I came up with the following:

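with open("myfile.txt", "r") as f:  # close the file once it has been read
    lines = f.readlines()

# solution 1
for x, line in enumerate(lines):
    if 'START FROM THE' in line:
        # scan forward from the line after the start marker
        for j in range(x + 1, len(lines)):
            if '****' in lines[j]:
                break
            print(lines[j])
        break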


# solution 2
startWriting = False
for line in lines:
    if startWriting:
        if '****' in line:
            break
        print(line)
    elif 'START FROM THE' in line:
        startWriting = True

They both work, but they are ugly. I'm wondering: is there a better solution to do this? Something not so wordy, with fewer conditions and nested loops? A cleaner and faster way?

I also tried this:

# solution 3
wanted = [x for x in lines if x.startswith('  5')]  # this works
for line in wanted:
    print(line)

But I'm reluctant to base my selection on the first char. What if they do not start with '5' anymore? I'm screwed!

Thanks for your patience and support. Any answer you want to throw at me will be highly regarded and very much appreciated.


Solution

  • If the start and end identifiers are always present, you can use the split method to print the lines needed.

    1. Split the text at the start identifier into a list of two strings. The second string then contains the text needed followed by the end identifier.
    2. Split that second string again, this time at the end identifier, and take the first element of the new list to get the text needed.

    Code

    with open("myfile.txt", "r") as f:
        read_text = f.read()

    # Keep what follows the start identifier, then cut it off at the end identifier
    print(read_text.split('START FROM THE NEXT ONE')[1].split('****')[0])
    
    

    Output

    
      555555555555555
      555999999999999
      555333333333333
      555111111111111
      555333333333333
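
The output above has a blank line before and after the wanted lines, because the split text still holds the newline that followed the start identifier and the one that preceded the end identifier. If you would rather have a clean list of lines, something along these lines might work. It is only a sketch: `lines_between` and `sample` are names made up for illustration, and it assumes both identifiers are present and that leading spaces are not wanted.

```python
def lines_between(text, start, end):
    """Return the stripped, non-empty lines found between start and end."""
    # Same double split as above: text after start, then text before end
    block = text.split(start)[1].split(end)[0]
    # splitlines() drops the newline characters; strip() drops the indentation
    return [line.strip() for line in block.splitlines() if line.strip()]


# A shortened stand-in for the contents of myfile.txt
sample = (
    "can be skipped\n"
    "START FROM THE NEXT ONE\n"
    "  555555555555555\n"
    "  555999999999999\n"
    "****\n"
    "I don't need\n"
)

for line in lines_between(sample, "START FROM THE NEXT ONE", "****"):
    print(line)
```

To use it on the real file, read the file into a string first and pass that string as `text`.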
    
    

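    If the start or end identifier could be absent, wrap the lookup in try/except. Only a missing start identifier raises IndexError by itself (split returns a one-element list, so [1] fails). A missing end identifier does not raise anything, so it has to be checked explicitly:

    with open("myfile2.txt", "r") as f:
        read_text = f.read()
    try:
        # [1] raises IndexError when the start identifier is missing
        after_start = read_text.split('START FROM THE NEXT ONE')[1]
        if '****' not in after_start:  # split() alone would not fail here
            raise IndexError('end identifier absent')
        print(after_start.split('****')[0])
    except IndexError:
        print('start or end identifiers absent')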
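
Since the question also asks for something with fewer conditions and nested loops, here is a line-by-line alternative that does not need the whole file as one string. This is a different technique from the split approach above: it relies on `itertools.dropwhile` and `itertools.takewhile` from the standard library. The helper name `iter_between` and the `sample_lines` list are made up for illustration, and the markers are matched with `in`, as in the question's own loops.

```python
from itertools import dropwhile, takewhile


def iter_between(lines, start, end):
    """Lazily yield the lines after the one containing start, up to the one containing end."""
    # Skip every line until one contains the start identifier
    remaining = dropwhile(lambda line: start not in line, lines)
    # Discard the start line itself (nothing happens if it was never found)
    next(remaining, None)
    # Yield lines until one contains the end identifier (or the input runs out)
    return takewhile(lambda line: end not in line, remaining)


# A shortened stand-in for the lines of myfile.txt
sample_lines = [
    "can be skipped\n",
    "START FROM THE NEXT ONE\n",
    "  555555555555555\n",
    "  555999999999999\n",
    "****\n",
    "I don't need\n",
]

for line in iter_between(sample_lines, "START FROM THE NEXT ONE", "****"):
    print(line, end="")  # each line already ends with a newline
```

An open file object can be passed in place of `sample_lines`, because iterating over a file yields its lines one at a time. If the start identifier is missing, nothing is printed. If the end identifier is missing, everything after the start line is printed, so add a check if that case matters to you.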