
How to overwrite a file?


This question is directly linked to my "How to modify a tsv-file column with Python" question. Briefly: I'd like to overwrite the first column of a TSV file, replacing a certain symbol (in_char) with another one (out_char). To write over the original file, I thought I could use the .truncate() method like this:

with open(my_file, "r+") as mf:
    lines = [line.rstrip() for line in mf]
    for line in lines:
        line = line.replace(in_char, out_char, 1)
        mf.seek(0)
        mf.write(line)
        mf.truncate()
mf.close()

The file is overwritten, but only with the last row of the TSV, so I end up with a one-row TSV.

For example if my in_char is the "|" symbol and my out_char is the "_" symbol, from my original TSV:

A|circ  properties  m4  298 298 28  +   .   coverage=81;
B|circ  properties  m4  307 307 40  -   .   coverage=74;
C|circ  properties  m4  361 361 23  +   .   coverage=77;

This is what I obtain:

C_circ  properties  m4  361 361 23  +   .   coverage=77;

What am I doing wrong?


Solution

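  • The problem is that you are modifying the file inside the loop: mf.seek(0) and mf.truncate() run on every iteration, so each line is written over the previous one and only the last line survives. (rstrip() also removes the newlines, so the rows would run together even without the seek.) I suggest you take one of two approaches: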

    1. Read the entire file into memory, make the modifications, then write the file back out.

    2. Create a temporary file to write to. Read the input file one line at a time, make the changes and write each line to the temporary file. Then rename the temporary file back to the original.
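The two approaches above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the function names replace_in_memory and replace_via_temp_file are made up for this example, and it assumes the columns are separated by tabs and that every occurrence of in_char in the first column should be replaced.

```python
import os
import tempfile


def _fix_line(line, in_char, out_char):
    # Split off the first tab-separated column and change only that part.
    first, sep, rest = line.partition("\t")
    return first.replace(in_char, out_char) + sep + rest


def replace_in_memory(path, in_char, out_char):
    """Approach 1: read everything, modify it, then write the file back out."""
    with open(path, "r", newline="") as f:
        lines = f.readlines()  # each line keeps its trailing newline

    new_lines = [_fix_line(line, in_char, out_char) for line in lines]

    # Opening with "w" truncates the file once, before anything is written.
    with open(path, "w", newline="") as f:
        f.writelines(new_lines)


def replace_via_temp_file(path, in_char, out_char):
    """Approach 2: stream into a temporary file, then rename it over the original."""
    # Create the temporary file next to the original so the final rename
    # stays on the same filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as tmp, open(path, "r", newline="") as src:
            for line in src:
                tmp.write(_fix_line(line, in_char, out_char))
        os.replace(tmp_path, path)  # replaces the original file
    except BaseException:
        os.remove(tmp_path)  # clean up the temporary file on failure
        raise
```

Either function would be called as, for example, replace_in_memory(my_file, "|", "_"). The first is simpler; the second avoids holding the whole file in memory and never leaves the original half-written, at the cost of the new file getting the temporary file's permissions.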

    As an aside, I suggest using the standard csv module for this. In particular, DictReader and DictWriter make this task straightforward.
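As a rough sketch of the csv-module route: the sample data in the question has no header row, so this example assumes plain csv.reader and csv.writer with a tab delimiter rather than DictReader and DictWriter, which expect named columns. The function name replace_first_column_csv is hypothetical, and default csv quoting is assumed to be acceptable for the data.

```python
import csv


def replace_first_column_csv(path, in_char, out_char):
    # Read all rows into memory first; newline="" lets the csv module
    # handle line endings itself.
    with open(path, newline="") as f:
        rows = list(csv.reader(f, delimiter="\t"))

    # Change only the first column of each non-empty row.
    for row in rows:
        if row:
            row[0] = row[0].replace(in_char, out_char)

    # Reopen in "w" mode, which truncates once, and write every row back.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter="\t", lineterminator="\n")
        writer.writerows(rows)
```

With a header row, the same structure should work with DictReader and DictWriter, indexing the column by name instead of by position.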