I need to extract a portion of text from a txt file.
The file looks like this:
STARTINGWORKING DD/MM/YYYY HH:MM:SS
... text lines ...
... more text lines ...
STARTINGWORKING DD/MM/YYYY HH:MM:SS
... text lines I want ...
... more text lines that I want ...
I tried using three for loops (one to find the start, another to read the lines in between, and a last one to find the end):
import os

file = "records.txt"
if file.endswith(".txt"):
    if os.path.exists(file):
        lines = [line.rstrip('\n') for line in open(file)]
        for line in lines:
            pass  # extract the portion
You can keep a variable that holds all the lines read since the last STARTINGWORK marker, and reset it each time a new marker appears.
When you finish processing the file, it contains exactly what you need.
You also do not need to read all the lines into a list first. You can iterate directly over the open file, which yields one line at a time, i.e.:
result = []
with open(file) as f:
    for line in f:
        if line.startswith("STARTINGWORK"):
            result = []  # discard what has accumulated so far
        result.append(line)  # add the line just read
print("".join(result))
result then holds everything from the last STARTINGWORK line onward, inclusive. Use result[1:] if you want to drop that initial STARTINGWORK line.
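As a rough sketch of the same idea wrapped in a function: the name last_block, the marker default, and the sample timestamps below are made up for illustration. This variant skips the marker line while reading instead of slicing it off afterwards, and it returns an empty list when the file contains no marker at all.

```python
import os
import tempfile


def last_block(path, marker="STARTINGWORK"):
    """Return the lines that follow the last line starting with `marker`."""
    block = None  # stays None until the first marker line is seen
    with open(path) as f:
        for line in f:
            if line.startswith(marker):
                block = []  # new block: discard the earlier one, skip the marker
            elif block is not None:
                block.append(line.rstrip("\n"))
    return block or []


# Hypothetical sample data, written to a temporary file for the demo.
sample = (
    "STARTINGWORKING 01/01/2020 08:00:00\n"
    "old line\n"
    "STARTINGWORKING 02/01/2020 09:30:00\n"
    "line I want\n"
    "another line I want\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)

wanted = last_block(tmp.name)
os.remove(tmp.name)
print(wanted)  # ['line I want', 'another line I want']
```

Because only the current block is kept in memory, this should also work for large files.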
- Then in the code:
# list that accumulates the lines of the current block
result = []

# function
def appendlines(line, result, word):
    if line.startswith(word):
        del result[:]  # empty the list in place when a new block starts
    result.append(line)
    return line, result

with open(file, "r") as lines:
    for line in lines:
        appendlines(line, result, "STARTINGWORK")

new_result = [line.rstrip("\n") for line in result[1:]]
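One detail worth noting in the helper above: it clears the list with del result[:] rather than result = []. Inside a function, assigning to the parameter only rebinds the local name, so the caller's list would keep its old contents. A small sketch of the difference (the function names here are made up for illustration):

```python
def reset_by_rebinding(items):
    items = []  # rebinds the local name only; the caller's list is untouched


def reset_in_place(items):
    del items[:]  # empties the very list object the caller passed in


a = ["x", "y"]
reset_by_rebinding(a)

b = ["x", "y"]
reset_in_place(b)

print(a, b)  # ['x', 'y'] []
```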