
python file in-out adding last three characters


OK, a basic Python question. I have a simple script that replaces text in HTML (.txt) files. Some of my files contain XHTML-style line breaks, and I want to replace them with plain HTML <br> tags. Sometimes the script works fine, but sometimes it rewrites the file and the last two characters are repeated on a new line at the end. I'm running Python 2.7 on OS X Lion, if that matters. Here's the code:

import sys
import re

def replace_text(filename):
    with open(filename, 'r+') as f:
        # Match the XHTML-style line break variants
        p = re.compile('(</br>|<br/>|<br />)')
        f_data = open(filename, 'r+').read()
        f.write(p.sub('<br>', f_data))

def main():
    args = sys.argv[1:]

    if not args:
        print 'usage: [--summaryfile] file [file ...]'
        sys.exit(1)

    summary = False

    if args[0] == '--summaryfile':
        summary = True
        del args[0]

    for filename in args:
        replace_text(filename)

if __name__ == '__main__':
    main()

So when I run it with a file (say 'foo.txt') that has the content:

</br> <br/> <br /> <br> poop

sometimes it outputs:

<br> <br> <br> <br> poop

and sometimes it outputs:

<br> <br> <br> <br> poop
op

Huh? Am I missing something simple? Again, this is Python 2.7 on OS X Lion, with Gedit 3.2.6 as the editor. Any help?


Solution

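  • You're shortening the content of the file, but you're not shortening the file itself: writing over a file opened in 'r+' mode leaves any old bytes past the end of the new data in place. Call .truncate() after writing and before closing.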
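
A minimal sketch of that fix, assuming the file is small enough to read into memory. The name BR_VARIANTS is just illustrative and not from the question, and reading through the same handle (then rewinding with seek(0)) is my own simplification rather than something the asker's code does:

```python
import re

# Hypothetical module-level name; same pattern as in the question.
BR_VARIANTS = re.compile(r'(</br>|<br/>|<br />)')

def replace_text(filename):
    """Replace XHTML-style line breaks with <br>, rewriting the file in place."""
    with open(filename, 'r+') as f:
        f_data = f.read()                         # read via the same handle
        f.seek(0)                                 # rewind before overwriting
        f.write(BR_VARIANTS.sub('<br>', f_data))
        f.truncate()                              # drop leftover bytes past the new end
```

Called with no argument, truncate() cuts the file at the current position, which after the write is the end of the new content. If you'd rather not think about file positions at all, reading the file, closing it, and reopening it in 'w' mode to write the result should also work, since 'w' empties the file on open.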