I've tried numerous answers here on Stack Overflow to remove the trailing blank lines from the 2.txt file (input):
2.txt file:
-11
B1
5
B1
-2
B1
7
B1
-11
B1
9
B1
-1
B1
-3
B1
19
B1
-22
B1
2
B1
1
B1
18
B1
-14
B1
0
B1
11
B1
-8
B1
-15
The only one that worked, using print(line), was this: https://stackoverflow.com/a/6745865/10824251. But when I use f.write(line) rather than print(line), my final 2.txt file (output) is as shown below:
2.txt file final:
-11B15B1-2B17B1-11B19B1-1B1-3B119B1-22B12B11B118B1-14B10B111B1-8B1-15
18
B1
-14
B1
0
B1
11
B1
-8
B1
-15
However, when I use print(line) instead of f.write(line), my bash terminal displays the output with the last lines deleted (see the print(line) result in the bash terminal below), but with the same deformation as the final 2.txt file; that is, the deletion itself works correctly. I have tried to understand what is happening but have not made any progress.
print(line) result in the bash terminal:
-11B15B1-2B17B1-11B19B1-1B1-3B119B1-22B12B11B118B1-14B10B111B1-8B1-15
18
B1
-14
B1
0
B1
11
B1
-8
B1
-15
UPDATE:
My script, which eliminates the last lines of the 2.txt file but deforms the first lines in the bash terminal:
for line in open('2.txt'):
    line = line.rstrip()
    if line != '':
        print(line)
My script, which deforms the first lines of the 2.txt file and also does not delete the last lines as desired in the output file 3.txt:
with open("2.txt",'r+') as f:
    for line in open('3.txt'):
        line = line.rstrip()
        if line != '':
            f.write(line)
rstrip() removes the trailing newline along with any other trailing whitespace, and f.write() does not add one back. So when you write the result, the cursor stays at the end of the same line and the next write continues right after it.
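The effect can be reproduced without touching any file. The following is a minimal sketch that uses an in-memory buffer in place of the output file; the sample lines are made up for illustration.

```python
import io

# Sample input lines, including trailing blank lines (illustrative data).
lines = ["-11\n", "B1\n", "\n", "\n"]

buf = io.StringIO()               # stands in for the output file
for line in lines:
    line = line.rstrip()          # drops the "\n" as well as spaces and tabs
    if line != '':
        buf.write(line)           # write() adds nothing, so lines run together

joined = buf.getvalue()
print(repr(joined))               # '-11B1'
```

With print(line) the terminal shows one value per line only because print() appends a newline itself.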
One way to fix it that's clear about what needs to change (all code unmodified but for addition of the last line):
with open("2.txt",'r+') as f:
    for line in open('3.txt'):
        line = line.rstrip()
        if line != '':
            f.write(line)
            f.write('\n')  # one extra line; in text mode '\n' is translated to the platform's line ending
Alternately, you could change f.write(line) to print(line, file=f).
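A file opened with 'r+' is not truncated, so content that is not overwritten stays in place. That would explain the old lines that still follow the joined line in the question's output. Below is a sketch of an in-place rewrite that uses print(line, file=f) and then truncates. The helper name drop_blank_lines and the sample file are assumptions for illustration, and like the loop in the question it drops every blank line, not only the trailing ones.

```python
import os
import tempfile

def drop_blank_lines(path):
    """Hypothetical helper: rewrite `path` in place without blank lines."""
    with open(path, 'r+') as f:
        # Read everything first, stripping trailing whitespace from each line.
        kept = [line.rstrip() for line in f]
        kept = [line for line in kept if line != '']
        f.seek(0)                    # go back to the start of the file
        for line in kept:
            print(line, file=f)      # print() appends the newline for us
        f.truncate()                 # cut off whatever old content remains

# Demo on a throwaway file (illustrative data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'w') as f:
        f.write('-11\nB1\n5\n\n\n')
    drop_blank_lines(path)
    with open(path) as f:
        result = f.read()

print(repr(result))                  # '-11\nB1\n5\n'
```

Reading all lines before writing avoids reading and writing through the same file position at once.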
If you need to trim a small number of blank lines from the end of an arbitrarily-large file, it makes sense to skip to the end of that file and work backwards; that way, you don't care how large the whole file is, but only how much content needs to be removed.
That is, something like:
import os, sys

block_size = 4096 # 4kb blocks; decent chance this is your page size & disk sector size.
filename = sys.argv[1] # or replace this with a hardcoded name if you prefer

with open(filename, 'r+b') as f: # seeking backwards only supported on files opened binary
    while True:
        f.seek(0, 2) # start at the end of the file
        offset = f.tell() # figure out where that is
        f.seek(max(0, offset - block_size), 0) # move up to block_size bytes back
        offset = f.tell() # figure out where we are
        trailing_content = f.read() # read from here to the end
        new_content = trailing_content.rstrip() # remove all trailing whitespace
        if new_content == trailing_content: # nothing to remove?
            break # then we're done.
        if new_content: # and if post-strip there's content (non-empty bytes)...
            f.seek(offset + len(new_content)) # jump to its end...
            f.write(os.linesep.encode('utf-8')) # ...write a newline...
            f.truncate() # and then delete the rest of the file.
            break
        else:
            f.seek(offset, 0) # go to where our block started
            f.truncate() # and delete *everything* after it
        # run through the loop again, to see if there's still more trailing whitespace.
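For a file small enough to read into memory, the same end result can be sketched more simply by stripping the whole text once and writing it back with a single final newline. The helper name trim_trailing_blank_lines and the sample file are assumptions for illustration. Unlike the per-line loop above, this keeps blank lines in the middle of the file.

```python
import os
import tempfile

def trim_trailing_blank_lines(path):
    """Hypothetical helper: remove trailing whitespace, end with one newline."""
    with open(path, 'r') as f:
        text = f.read()
    trimmed = text.rstrip()              # strip only at the end of the text
    with open(path, 'w') as f:           # 'w' truncates, so no leftovers
        if trimmed:
            f.write(trimmed + '\n')

# Demo on a throwaway file (illustrative data).
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')
    with open(path, 'w') as f:
        f.write('-8\nB1\n-15\n\n\n\n')
    trim_trailing_blank_lines(path)
    with open(path) as f:
        trimmed_text = f.read()

print(repr(trimmed_text))                # '-8\nB1\n-15\n'
```

The block-based version remains the better fit when the file is too large to load whole.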