Tags: python, unicode, utf-8, python-unicode, devanagari

Reading and writing Hindi Devanagari characters in a text file


I am having trouble reading and writing Hindi Devanagari characters in text files using Python.

  1. Reading:
    Python does not read the Devanagari characters in my text file correctly when I run the following code.

    Code:

         f=open(r"C:\Users\Dell\Desktop\abc1.txt","r")
         print(f.read())
         f.close()
    

    O/P: garbled characters (instead of म)

    The same code produces the correct output for an '&' symbol in my file:

    O/P: &

  2. Writing:
    The following code raises an error.
    The Unicode code point for म is U+092E.
    Code:

      f=open(r"C:\Users\Dell\Desktop\abc1.txt","w")
      f.write(u"\u092e")
      f.close()
    

    Error Message:

     Exception has occurred: UnicodeEncodeError
     'charmap' codec can't encode character '\u092e' in position 0: character maps to <undefined>
       File "C:\Users\Dell\Desktop\Python\gg.py", line 2, in <module>
         f.write(u"\u092e")
    

The same character prints correctly to standard output:

Code:

print(u"\u092e")

O/P: म

Why does this happen? How can I read and write Devanagari characters in a text file? Are there any alternatives?



Solution

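  • Have you tried opening the file with an explicit encoding such as UTF-8 or UTF-16? Which one to use depends on how your source file was saved. Without an encoding argument, open() falls back to the platform default. On Windows this is typically a legacy code page (the 'charmap' codec named in the error message), which has no mapping for Devanagari characters. print() uses the encoding of standard output instead, which is why it can succeed where the file write fails.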

    For example, try:

    with open(r'C:\Users\Dell\Desktop\abc1.txt','r', encoding='utf-16') as f:
        print(f.read())
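
The same idea applies to writing. Below is a minimal sketch, assuming the file should be stored as UTF-8. The path is a hypothetical temporary file chosen for illustration, not the one from the question. The last step shows what likely happens if a UTF-8 file is decoded as cp1252 (assumed here to be the Windows default): the three bytes of म turn into three unrelated characters.

```python
import locale
import os
import tempfile

# Hypothetical path for illustration; substitute the real file location.
path = os.path.join(tempfile.gettempdir(), "devanagari_demo.txt")

# The encoding open() uses when none is given. On Windows this is often
# a legacy code page such as cp1252 rather than UTF-8.
print("default encoding:", locale.getpreferredencoding(False))

# Writing: name the encoding explicitly so U+092E can be encoded.
with open(path, "w", encoding="utf-8") as f:
    f.write("\u092e")

# Reading: decode with the same encoding that was used to write.
with open(path, "r", encoding="utf-8") as f:
    text = f.read()

print("round trip ok:", text == "\u092e")

# What a mismatch looks like: the UTF-8 bytes of the character
# (E0 A4 AE) decoded as cp1252 become three separate characters.
garbled = "\u092e".encode("utf-8").decode("cp1252")
print("length after wrong decode:", len(garbled))

os.remove(path)
```

If the existing file turns out to be UTF-16 (some Windows editors save it under the name "Unicode"), pass encoding='utf-16' when reading it, as in the answer above. Whichever encoding is chosen, use the same one for reading and writing.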