Tags: python, html, scripting, utf-16, utf

Search/replace text in HTML files with Python


I'm trying to search and replace text inside multiple HTML files with the code below. It works with ordinary .txt files, but not with the HTML files converted to .txt. Is it a UTF-16 problem? How can I make it work?

import os
directory = "/Users/sinanatra/PYTHON_STUFF/MSN/0/"

replacement = "test"
for dname, dirs, files in os.walk(directory):
    for fname in files:
        fpath = os.path.join(dname, fname)
        with open(fpath) as f:
            s = f.read()
        s = s.replace("head", replacement)
        with open(fpath, "w") as f:
            f.write(s)

Solution

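  • If the files are UTF-16, reading them without specifying that encoding does not decode the text correctly, so the search string will not match (or the read may fail outright). Open the file in binary mode ("rb") and decode after reading:

    s = f.read().decode('utf-16')

    Then encode again before writing, with the file opened in "wb" mode:

    f.write(s.encode('utf-16'))

    In Python 3 you can instead pass encoding='utf-16' to open() for both the read and the write, and keep the rest of the loop unchanged.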