
Extract lines from a txt file in Python


Imagine I have this text in a txt file:

bla bla bla
bla bla bla
Title Lorem ipsum dolor sit amet, consectetur adipiscing
elit, sed do eiusmod tempor incididunt ut labore et dolore
magna aliqua. Ut enim ad minim veniam,
condition
bla bla bla
bla bla bla
Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem
accusantium doloremque laudantium, totam rem aperiam,
eaque ipsa quae ab illo inventore veritatis
condition
bla bla bla

From text with the structure above (hundreds of lines), I want to extract each line that starts with 'Title', together with the lines that follow it, up to the line that starts with the word 'condition'. So the result would be something like this:

Title Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,

Title Sed ut perspiciatis unde omnis iste natus error sit voluptatem accusantium doloremque laudantium, totam rem aperiam, eaque ipsa quae ab illo inventore veritatis

I can select the first line of each block with this code, but I don't know how to add the following lines until I find the word 'condition'. Could you help me, please?

import re

outF = open("myOutFile.txt", "w")
hand = open('doubt.txt', encoding="utf8")
for line in hand:
    line = line.rstrip()
    if re.search('^Title', line):
        outF.write(line + "\n")
        outF.write("\n")
hand.close()
outF.close()

Solution

  • In case you want all titles until the first condition line appears, you need to break the loop:

    for line in hand:
        line = line.rstrip()
        if line.startswith("Title"):
            # rstrip() removed the newline, so add it back when writing
            outF.write(line + "\n")
        if line.startswith("condition"):
            # stop at the first condition line
            break

    outF.close()
    

    In case you want to write all lines after a title till the next condition appears:

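    write = False      # True while we are between a Title line and a condition line
    writelines = []

    for line in hand:
        line = line.rstrip()

        if line.startswith("condition"):
            # end the current block; skip condition lines that close nothing
            if write:
                writelines.append("\n")
            write = False

        if line.startswith("Title"):
            write = True

        if write:
            writelines.append(line + " ")

    outF.writelines(writelines)
    outF.close()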
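
If it helps, the second approach can also be wrapped in a small function that returns the joined blocks instead of writing them straight away. This is only a sketch: the names `extract_title_blocks` and `write_blocks` and the parameters `start` and `stop` are made up for illustration, and it assumes that a block still open at the end of the file should be kept.

```python
def extract_title_blocks(lines, start="Title", stop="condition"):
    """Return one string per block, from a `start` line up to the next `stop` line."""
    blocks = []
    current = []      # lines of the block being collected
    inside = False    # True while between a start line and a stop line

    for raw in lines:
        line = raw.strip()
        if line.startswith(stop):
            if inside:
                blocks.append(" ".join(current))
            current, inside = [], False
            continue
        if line.startswith(start):
            inside = True
        if inside and line:
            current.append(line)

    # Assumption: keep a block that is never closed by a stop line
    if inside and current:
        blocks.append(" ".join(current))
    return blocks


def write_blocks(src_path, dst_path):
    """Read src_path and write the blocks to dst_path, separated by blank lines."""
    with open(src_path, encoding="utf8") as src:
        blocks = extract_title_blocks(src)
    with open(dst_path, "w", encoding="utf8") as dst:
        dst.write("\n\n".join(blocks) + "\n")
```

With the files from the question, `write_blocks("doubt.txt", "myOutFile.txt")` should give one paragraph per title with a blank line in between. Using `with` also closes both files even if something goes wrong halfway through.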