I have a git repository that has a UTF-16 file in it. It's only UTF-16 by accident; the file could be encoded in 7-bit ASCII without any loss of data. I'd like to use something like reposurgeon to convert the file to UTF-8 so that git diff will work with older revisions of the file and I don't have to resort to git difftool. Is this possible?
Why don't you just convert the file to UTF-8 and commit it, e.g. with:
iconv -f UTF-16 -t UTF-8 input-file.txt > input-file.txt.fixed
# Check here that the conversion worked OK
mv -i input-file.txt.fixed input-file.txt
git commit -m 'Convert input-file.txt from UTF-16 to UTF-8' input-file.txt
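For the "check that the conversion worked" step, one option is to confirm that the converted file really is plain ASCII, as the question expects. This is only a sketch: the temporary directory and the sample text are made up for illustration so it can be tried without touching real data, and it assumes an iconv that knows the ASCII charset name (GNU libc and libiconv both do).

```shell
# Hypothetical sample data: build a small UTF-16 file in a scratch directory.
tmp=$(mktemp -d)
printf 'hello\nworld\n' | iconv -f UTF-8 -t UTF-16 > "$tmp/input-file.txt"

# Same conversion as in the answer above.
iconv -f UTF-16 -t UTF-8 "$tmp/input-file.txt" > "$tmp/input-file.txt.fixed"

# iconv exits non-zero if it meets a character it cannot represent in ASCII,
# so a successful run means the converted file is 7-bit clean.
if iconv -f UTF-8 -t ASCII "$tmp/input-file.txt.fixed" > /dev/null; then
    result="pure ASCII"
else
    result="contains non-ASCII characters"
fi
echo "$result"
```

If the check reports non-ASCII characters, the file is still valid UTF-8, but it is worth looking at what those characters are before committing.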
Update after a clarifying comment:
If you want to rewrite that file at every commit in the history of HEAD, you can use git filter-branch, something like:
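git filter-branch --tree-filter \
'# Skip commits where the file does not exist, otherwise iconv fails and the rewrite aborts
if [ -f input-file.txt ]; then
iconv -f UTF-16 -t UTF-8 input-file.txt > input-file.txt.fixed &&
mv input-file.txt.fixed input-file.txt
fi' HEAD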
Of course, rewriting history in this way may cause problems if you have shared this repository with anyone. (I haven't tested that command, so use it with care, probably only on a new clone of your repository.)
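To get a feel for what the rewrite does before running it on anything real, it can be tried on a throwaway repository first. The following is a rough sketch under assumptions: the directory names (original, scratch), the identity settings and the file contents are all invented for the demonstration, and FILTER_BRANCH_SQUELCH_WARNING is only there to skip the warning that newer git versions print before filter-branch starts.

```shell
# Hypothetical demo repository with two commits of a UTF-16 file.
work=$(mktemp -d)
git init -q "$work/original"
cd "$work/original"
git config user.name "Example User"
git config user.email "example@example.com"
printf 'first\n' | iconv -f UTF-8 -t UTF-16 > input-file.txt
git add input-file.txt
git commit -q -m 'Add UTF-16 file'
printf 'first\nsecond\n' | iconv -f UTF-8 -t UTF-16 > input-file.txt
git commit -q -m 'Extend file' input-file.txt

# Rewrite history in a fresh clone, leaving the original untouched.
git clone -q "$work/original" "$work/scratch"
cd "$work/scratch"
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --tree-filter \
'if [ -f input-file.txt ]; then
iconv -f UTF-16 -t UTF-8 input-file.txt > input-file.txt.fixed &&
mv input-file.txt.fixed input-file.txt
fi' HEAD

# Every revision is now UTF-8, so plain git diff shows a textual change.
git diff HEAD~1 HEAD -- input-file.txt
```

The original repository keeps its UTF-16 history, so the two can be compared side by side, and the scratch clone can simply be deleted if the result is not what you wanted.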