
Insert in postgres DB - Hindi


I am creating a Java application in which I would like to enter data in languages other than English. For example, I want to enter data in Hindi (UTF-8) characters. I have converted the data to the hex string '\xe0\xa4\xa8\xe0\xa4\xbe\xe0\xa4\x97\xe0\xa4\xb0\xe0\xa4\xbf\xe0\xa4\x95'.

However, when I try to convert the data back to Hindi using convert_from, I get the error below:

select convert_from('\xe0\xa4\xa8\xe0\xa4\xbe\xe0\xa4\x97\xe0\xa4\xb0\xe0\xa4\xbf\xe0\xa4\x95', 'UTF8') aa

Error:

ERROR:  invalid hexadecimal digit: "\"
LINE 1: select convert_from('\xe0\xa4\xa8\xe0\xa4\xbe\xe0\xa4\x97\xe...
                            ^
********** Error **********

ERROR: invalid hexadecimal digit: "\"
SQL state: 22023
Character: 21

I am using Postgres.


Solution

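  • You have several options. The error occurs because, in a plain literal, only the leading \x is read as the bytea hex-format prefix, so the next backslash is rejected as an invalid hexadecimal digit. If your database is already in UTF8 (and the client encoding is also UTF8), just use an escape string literal: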

    select E'\xe0\xa4\xa8\xe0\xa4\xbe\xe0\xa4\x97\xe0\xa4\xb0\xe0\xa4\xbf\xe0\xa4\x95'
    

    Use convert_from() with an escape string literal when your database is not UTF8 (but your client encoding is UTF8):

    select convert_from(E'\xe0\xa4\xa8\xe0\xa4\xbe\xe0\xa4\x97\xe0\xa4\xb0\xe0\xa4\xbf\xe0\xa4\x95', 'UTF8')
    

    Use convert_from() and decode() with a plain literal when neither your database nor your client encoding is UTF8:

    select convert_from(decode('e0a4a8e0a4bee0a497e0a4b0e0a4bfe0a495', 'hex'), 'UTF8')
    

    http://rextester.com/SPGFDP50612

    Regarding "I have converted the data to hex string": these problems shouldn't arise in the first place if you use prepared statements and bind your original data as a parameter. Your language's database bindings should handle the conversion for you.
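
A minimal Java sketch of the prepared-statement approach, under some assumptions: the PostgreSQL JDBC driver is on the classpath, the database encoding can store Devanagari (for example UTF8), and the table `citizens` with a text column `name` exists. The table, column, class, and method names are hypothetical and not from the question. The `utf8Hex` helper is only there to show where the hex string in the question comes from; it is not needed when the text is bound as a parameter.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class HindiInsert {
    // Hypothetical table and column, used only for illustration.
    static final String INSERT_SQL = "INSERT INTO citizens (name) VALUES (?)";

    // Binds the original text as a parameter, so no manual hex
    // conversion is needed; the driver takes care of the encoding.
    static int insertName(Connection conn, String name) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, name);
            return ps.executeUpdate(); // number of rows inserted
        }
    }

    // Returns the UTF-8 bytes of the text as lowercase hex, i.e. the
    // form that decode(..., 'hex') expects on the SQL side.
    static String utf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

With a `Connection` obtained in the usual way (for example from `DriverManager.getConnection`), calling `HindiInsert.insertName(conn, hindiText)` passes the Java string as-is, and a later `SELECT name FROM citizens` should return the same text without any `convert_from` call.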