Python remove all lines starting with pattern


So I have 7000+ txt files that look something like this:

1 0.51 0.73 0.81

0 0.24 0.31 0.18

2 0.71 0.47 0.96

1 0.15 0.25 0.48

And as output I want:

0 0.24 0.31 0.18

2 0.71 0.47 0.96

I wrote this code by combining multiple sources:

    #!/usr/bin/env python3
    import glob
    import os
    import pathlib
    import re
    path = './*.txt'

    for filename in glob.glob(path):
        with open(filename, 'r') as f:
            for line in f.readlines():
                if not (line.startswith('1')):
                    print(line)
                    out = open(filename, 'w')
                    out.write(line)
            f.close()

But the output for the example above is:

2 0.71 0.47 0.96

How can I fix the code to give me the correct output?


Solution

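  • The problem is that you reopen the output file in 'w' mode for every line you keep. Opening a file in 'w' mode truncates it, so each write replaces the previous one and only the last kept line survives. This can be fixed by reading all the lines first, then opening the output file once and using it for every line.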

    #!/usr/bin/env python3
    from glob import glob

    for filename in glob('./*.txt'):
        # Read every line into memory before the file is reopened for writing.
        with open(filename, 'r') as original_file:
            original_lines = original_file.readlines()
        # Open the file for writing once, then write only the lines to keep.
        with open(filename, 'w') as updated_file:
            updated_file.writelines(
                line
                for line in original_lines
                if not line.startswith('1')
            )
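
One caveat with `startswith('1')`: it also matches lines whose first field is `10`, `12`, and so on. If the first column is a label that can have more than one digit (an assumption, since the question only shows `0`, `1` and `2`), comparing the whole first field avoids dropping those lines by accident. A minimal sketch of that variant, where `drop_lines_with_label` is a hypothetical helper name and the demo runs on a throwaway file in a temporary directory:

```python
import tempfile
from pathlib import Path


def drop_lines_with_label(path, label="1"):
    """Rewrite the file at `path` without lines whose first field equals `label`.

    Returns the number of lines removed. (Hypothetical helper, not from the
    original answer.)
    """
    # Read everything first, so reopening in 'w' mode cannot lose data.
    with open(path, "r") as src:
        lines = src.readlines()

    # split() with no arguments splits on any whitespace. A blank line gives
    # an empty list, so it never equals [label] and is kept unchanged.
    kept = [line for line in lines if line.split()[:1] != [label]]

    # Open the output once and write all kept lines in a single call.
    with open(path, "w") as dst:
        dst.writelines(kept)
    return len(lines) - len(kept)


# Demo on a temporary file, so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text(
        "1 0.51 0.73 0.81\n"
        "0 0.24 0.31 0.18\n"
        "10 0.71 0.47 0.96\n"
        "1 0.15 0.25 0.48\n"
    )
    removed = drop_lines_with_label(sample, "1")
    print(removed)             # 2
    print(sample.read_text())  # the "0 ..." and "10 ..." lines remain
```

To apply it to every file, call it inside the same `glob('./*.txt')` loop. Because the files are overwritten in place, it is worth trying it on a copy of a few files first.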