Search code examples
python-3.xoracle-databasecx-oracle

Japanese characters showing up as question marks on querying Oracle DB, using cx_Oracle


I'm using cx_Oracle v7.1.3 with Python 3.6.4. My current Oracle Database has both English, Japanese and numeric values. The English and numeric values are being retrieved fine, but the Japanese text shows up as '???'.

My first thought was this had to do something with the encoding, maybe the default ASCII would be raising a problem, so I used str.encode() to try and encode the Japanese strings to UTF8, but to no avail. On printing them, they simply show b'????'

with open('get_table_names.sql', 'r') as file:
  for line in file:
     SQL_QUERY = str(line)

cursor.execute(SQL_QUERY)
# Inner cursor to read inner fields
new_cursor = connection.cursor()
for desc, table_name in cursor:
  # print(cursor.description)
  new_cursor.execute("SELECT * FROM {}".format(table_name.lower()))    

The with, open block simply reads an SQL query from a file. The cursor executes that and retrieves the names of all the tables to be queried and the new_cursor is used to read every single table.

Any and all help will be appreciated.


Solution

  • Either set the NLS_LANG environment variable with your character set before you start Python, or (easier) use encoding when you connect, see https://cx-oracle.readthedocs.io/en/latest/module.html#cx_Oracle.connect

    Use something like:

    conn = cx_Oracle.connect("user/password@hostname/servicename", encoding="UTF-8", nencoding="UTF-8")
    

    You may need a different encoding.