Tags: python, python-2.7, logfile, readlines

How to read previous lines in Python relative to a search found in the log file?


I am new to Python and just trying things out.
I have a huge file. After finding a search phrase in it, I need to go back n lines to the start tag that begins that block of text, and then start reading from that position.

The phrase can occur multiple times, and there are multiple start tags. A sample file is shown below:

<module>
hi
flowers
<name>xxx</name>
<age>46</age>
</module>
<module>
<place>yyyy</place>
<name>janiiiii</janii>
</module>

Assume the search is for a <name> value, and I need to go back to the <module> line once I find that <name>. The number of lines between the <module> line and the matching line varies; it is not fixed. So once I find the name, I need to go back to the module line and start reading from there.

Here is my code so far:

lastiterline = None
linec = 0
line_num = 0
search_phrase = "janiii"  # matching is case-sensitive
with open(r'c:\sample.txt', "r") as f:
    for line in f:
        line_num += 1
        line = line.strip()
        if line.startswith("<module>"):
            # remember the most recent module line and its line number
            lastiterline = line
            linec = line_num
        elif line.find(search_phrase) >= 0:
            if lastiterline:
                print(line)
                print(linec)

This gives me the line number of the module that contains the search phrase, but I am unable to move the file pointer back to start reading again from the module line. There will be multiple matches of the search phrase, so every time I need to go back to that line without breaking out of the main for loop, which reads the entire huge file.

For example, there may be 100 module tags, of which only 10 contain the search phrase I want, so I need just those 10 modules.


Solution

  • OK, here is an example for you; it should help you be more specific about what you need.

    This is a sample of your huge_file.txt:

    wgoi jowijg
    <start tag>
    wfejoije jfie
    fwjoejo
    THE PHRASE
    jwieo
    <end tag>
    wjefoiw wgworjg
    <start tag>
    wjgoirg 
    <end tag>
    <start tag>
    wfejoije jfie
    fwjoejo
    woeoj
    jwieo
    THE PHRASE
    <end tag>
    

    And a script read_prev_lines.py:

    # Read all lines into memory (this assumes the file fits in RAM).
    with open("huge_file.txt", "r") as f:
        hugefile = f.readlines()

    # One dict per block, holding the line indices of its start tag,
    # of THE PHRASE (if present) and of its end tag.
    start_locations = []
    for idx, line in enumerate(hugefile):
        if "<start tag>" in line:
            start_locations.append({"start": idx})
        if not start_locations:
            continue  # nothing to record before the first start tag
        if "THE PHRASE" in line:
            start_locations[-1]["phr"] = idx
        if "<end tag>" in line:
            start_locations[-1]["end"] = idx

    # Print every complete block that contains the phrase.
    for n, block in enumerate(start_locations):
        if "phr" in block and "end" in block:
            print("Found THE PHRASE after %d start tag(s), at line %d:" % (n, block["phr"]))
            print("Here is the whole block that contains the phrase:")
            print("".join(hugefile[block["start"]:block["end"] + 1]))
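
Since the question mentions a huge file, reading everything with readlines() may use too much memory. A possible alternative, sketched below, is to stream the file once and keep only the lines of the module currently being read, so there is no need to move the file pointer back. The function name matching_modules and its parameters are made up for this illustration, and the sketch assumes each tag sits at the start of its own line, as in the question's sample.

```python
def matching_modules(lines, phrase, start_tag="<module>", end_tag="</module>"):
    """Yield (start_line_number, block_lines) for every block containing phrase.

    `lines` can be any iterable of strings, for example an open file object.
    Only the block currently being read is held in memory.
    """
    block = None      # lines of the current block, or None when outside a block
    start_num = 0     # 1-based line number of the current block's start tag
    found = False     # whether the phrase was seen in the current block
    for line_num, raw in enumerate(lines, 1):
        line = raw.strip()
        if line.startswith(start_tag):
            # a new block begins: reset the buffer
            block, start_num, found = [], line_num, False
        if block is not None:
            block.append(line)
            if phrase in line:
                found = True
            if line.startswith(end_tag):
                # block is complete: hand it over only if it matched
                if found:
                    yield start_num, block
                block = None


# Sample data copied from the question; a real run would pass an open file.
sample = """<module>
hi
flowers
<name>xxx</name>
<age>46</age>
</module>
<module>
<place>yyyy</place>
<name>janiiiii</janii>
</module>
"""

results = list(matching_modules(sample.splitlines(), "janiii"))
for start, block in results:
    print("module starts at line %d" % start)
    print("\n".join(block))
```

With a real file, the same generator could be used as `with open(path) as f: for start, block in matching_modules(f, "janiii"): ...`. Blocks without the phrase are discarded as soon as their end tag is read, so with 100 modules and 10 matches only those 10 are yielded.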