OK, so a basic Python question. I have a simple script that replaces text in HTML (.txt) files. Some of my files contain XHTML-style line breaks, and I want to replace them with plain HTML <br> tags. Sometimes when I run the script it works fine, but sometimes it rewrites the file and the last two characters end up repeated after a newline at the end. I'm running Python 2.7 on OS X Lion, if that matters. Here's the code:
import sys
import re

def replace_text(filename):
    with open(filename, 'r+') as f:
        p = re.compile( '(</br>|<br/>|<br />)')
        f_data = open(filename, 'r+').read()
        f.write(p.sub('<br>', f_data))
        f.close()

def main():
    args = sys.argv[1:]
    if not args:
        print 'usage: [--summaryfile] file [file ...]'
        sys.exit(1)
    summary = False
    if args[0] == '--summaryfile':
        summary = True
        del args[0]
    for filename in args:
        replace_text(filename)

if __name__ == '__main__':
    main()
So when I run it with a file (say 'foo.txt') that has the content:
</br> <br/> <br /> <br> poop
sometimes it outputs:
<br> <br> <br> <br> poop
and sometimes it outputs:
<br> <br> <br> <br> poop
op
Huh? Am I missing something simple? I'm also using gedit 3.2.6 as my editor. Any help?
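You're shortening the content of the file, but you're not shortening its length. The new text is written over the start of the old text, and because the new text is shorter, the old bytes past its end stay in the file. Don't forget to cut them off by calling .truncate() on the file object after writing and before closing.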
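Here is a minimal sketch of that fix, assuming the same function name and the same three alternatives as the regex in the question. BR_PATTERN is just a name I made up for the compiled pattern. It reads through the one file handle, seeks back to the start, writes the new text, and then truncates at the current position:

```python
import re

# BR_PATTERN is a hypothetical name; the alternatives are the ones from the question.
BR_PATTERN = re.compile(r'</br>|<br/>|<br />')


def replace_text(filename):
    with open(filename, 'r+') as f:
        data = f.read()      # read through the same handle instead of opening twice
        f.seek(0)            # move back to the start before overwriting
        f.write(BR_PATTERN.sub('<br>', data))
        f.truncate()         # drop whatever is left of the old, longer content
    # no explicit close() needed: the with block closes the file
```

If you don't need to edit in place, reading the file, closing it, and reopening it with mode 'w' should also work, since 'w' empties the file on open.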