Search code examples
phpunicodeoracle11goci8

PHP OCI8 Unable to read Unicode chars from NCHAR (Oracle 11g)


I'm spending days trying to read Unicode characters (Simplified Chinese names) from an Oracle 11g via PHP7 OCI8 (tried with PDO_OCI and ODBC and in PHP5 as well, same problem) on Linux Ubuntu server 14.04.

This are the Oracle 11g settings:

NLS_LANGUAGE    AMERICAN
NLS_TERRITORY   AMERICA
NLS_CHARACTERSET    WE8MSWIN1252
NLS_SORT    BINARY
NLS_NCHAR_CHARACTERSET  AL16UTF16
NLS_COMP    BINARY
NLS_LENGTH_SEMANTICS    BYTE
NLS_NCHAR_CONV_EXCP FALSE

Important The table column I'm trying to read is defined as NCHAR(40). I use php-7.0.8 NTS, oci8-2.0.11 (from PECL) and installed Oracle instantclient

This is the php code I'm using:

<?php
$fp = fopen('output.txt','w');
$conn = oci_connect('MYUSER', 'MYPASSWORD', 'xxx.xxx.xxx.xxxx/DBTEST','AL32UTF8');

$query = "SELECT ABALPH FROM CRPDTA.F0101 WHERE ROWNUM <= 1 ORDER BY NULL";
$stid = oci_parse($conn, $query);
oci_execute($stid, OCI_NO_AUTO_COMMIT);

while ($row = oci_fetch_array($stid, OCI_ASSOC+OCI_RETURN_NULLS)) {
    fwrite($fp, $row['ABALPH']."\n");
}
fclose($fp);
?>

Then I can see English names but I get ¿¿¿¿¿¿¿¿¿¿¿¿¿ into my txt file (UTF8 w/o BOM format) for Chinese names. I think this is related to the NCHAR(40) datatype.

Please, note I've set the Oracle env variables directly (through export and/or putenv()). When I use sqlplus command I can get to export readable Chinese names into a file I set export NLS_LANG ="SIMPLIFIED CHINESE_CHINA.ZHS16GBK"

Any idea about how to solve this? I'd greatly appreciate it! Regards, Sergio


Solution

  • I think I found the solution in this blog post.

    Reading the article's comments, basically the NCHAR field needs to be converted to RAW/HEX:

    $query = "SELECT UTL_RAW.cast_to_raw(convert(ABALPH,'AL32UTF8')) FROM CRPDTA.F0101 WHERE ROWNUM <= 1 ORDER BY NULL";
    

    If you use oci8 extension in PHP, there is no need to further convert the hex to UTF8 as, according to the post, it's automagically done by PHP. However, if you use PDO_OCI then you need to convert your hex field by using hex2bin() before using it as UTF8 string.

    Hope this could help.