
Put u in front of text in Python


I am trying to replace non-ASCII characters with ASCII ones.

This works well:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from unidecode import unidecode

in_text = u"protégé"

out = unidecode(in_text)

print out

result: protA(c)gA(c)

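In this case I have to paste the text into the script manually.

The problem is the 'u' prefix in front of the text: it only works on a string literal, not on text held in a variable.

I'd like to read the text from a file automatically. Something like this: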

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from unidecode import unidecode

with open("C:\Users\B\Desktop\\0.txt", "r") as f:
    in_text = f.read()
    
char_text = u(in_text)

out = unidecode(char_text)

I am using Python 2.7 and Unidecode (https://pypi.org/project/Unidecode/).


Solution

  • Fix for Python 2 only:

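    from unidecode import unidecode
    import io

    # io.open with an encoding returns unicode text, so no u prefix is needed.
    # The raw string (r"...") keeps the backslashes in the Windows path literal.
    with io.open(r"C:\Users\B\Desktop\0.txt", "r", encoding="utf-8") as f:
        for line in f:
            out = unidecode(line)
            # Print inside the loop so every line is shown, not only the last one.
            print(out)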
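
The `u` prefix only applies to string literals in the source file. For text that arrives at runtime, the equivalent step is decoding bytes into unicode. The following is a minimal sketch of that idea, assuming the file is UTF-8 encoded; `read_text`, `decode_bytes`, and the temporary demo file are hypothetical names used for illustration only, not part of Unidecode.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Read a file and return its contents as decoded (unicode) text."""
    with io.open(path, "r", encoding=encoding) as f:
        return f.read()


def decode_bytes(raw, encoding="utf-8"):
    """Decode a byte string: the runtime counterpart of the u"" literal prefix."""
    return raw.decode(encoding)


# Demo data: the UTF-8 bytes of "protégé", written to a throwaway file.
raw = u"prot\u00e9g\u00e9".encode("utf-8")
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(raw)

text_from_file = read_text(demo_path)   # decoded while reading
text_from_bytes = decode_bytes(raw)     # decoded after reading
os.remove(demo_path)

# Either value can now be passed to unidecode(), for example:
# out = unidecode(text_from_file)
```

Both routes produce the same unicode text. Opening the file with `io.open(..., encoding=...)` is usually the simpler choice, because the decoding happens in one place and the same code runs on Python 2 and Python 3.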