I am trying to replace non-ASCII characters with ASCII ones.
It works well with a hard-coded string:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from unidecode import unidecode
in_text = u"protégé"
out = unidecode(in_text)
print out
result: protA(c)gA(c)
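In this case I have to copy the text into the script manually.
The problem is the 'u' prefix in front of the text: it only works on a string literal, not on text that is read at run time.
I'd like to read the text from a file automatically. Something like this: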
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from unidecode import unidecode
with open("C:\Users\B\Desktop\\0.txt", "r") as f:
    in_text = f.read()
char_text = u(in_text)
out = unidecode(char_text)
I am using Python 2.7 with Unidecode: https://pypi.org/project/Unidecode/
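For context, the 'u' prefix is literal syntax, so it cannot be called like a function. The run-time counterpart is decoding bytes with the right codec. The sketch below uses only built-in string methods; the variable names are illustrative and not taken from the post. It also shows how decoding with the wrong codec (latin-1 is assumed here purely as an example) produces mojibake, the kind of input a transliterator would likely turn into something resembling protA(c)gA(c).

```python
# -*- coding: utf-8 -*-
from __future__ import print_function

text = u"protégé"                 # unicode literal: what the u prefix produces
raw = text.encode("utf-8")        # the bytes a UTF-8 encoded file would contain

decoded = raw.decode("utf-8")     # run-time counterpart of the u prefix
mojibake = raw.decode("latin-1")  # wrong codec: every byte becomes one character

print(decoded == text)            # the round trip restores the original text
print(len(text), len(mojibake))   # the mis-decoded string is longer
print(repr(mojibake))
```

Each "é" occupies two bytes in UTF-8, so the mis-decoded string gains one extra character per accented letter.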
Fix for Python 2 only:
from unidecode import unidecode
import io
with io.open("C:\Users\B\Desktop\\0.txt", "r", encoding="utf-8") as file:
    for line in file:
        char_text = u"{}".format(line)
        out = unidecode(char_text)
        print(out)
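If the same approach has to run on both Python 2 and Python 3, a rough sketch could look like the one below. It is an illustration under assumptions, not the method from the post: it writes its own temporary sample file instead of using the Desktop path, and to_ascii is a hypothetical standard-library stand-in for unidecode, so that the example runs without third-party packages. The stand-in only strips accents via NFKD normalization and silently drops characters that have no ASCII decomposition, whereas unidecode transliterates far more.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function
import io
import os
import tempfile
import unicodedata


def to_ascii(text):
    """Hypothetical stand-in for unidecode: decompose, then drop non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Write a small UTF-8 sample file so the example is reproducible.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with io.open(path, "w", encoding="utf-8") as f:
    f.write(u"protégé\n")

# io.open with an explicit encoding yields unicode text on Python 2 and 3,
# so no u prefix or manual decode is needed on the lines that are read.
with io.open(path, "r", encoding="utf-8") as f:
    lines = [to_ascii(line.rstrip(u"\n")) for line in f]

os.remove(path)
print(lines)
```

Swapping to_ascii for unidecode inside the list comprehension should give the behaviour of the fix above, since both take unicode text and return an ASCII-only string.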