
How to get UTF-8 string using COM object in PHP?


In my PHP code, I instantiate a COM object from an external DLL. It works fine, except that I have a problem receiving a JSON string from one of the COM object's methods.

If the string contains no non-Latin characters, the JSON I get is correct. But if there is at least one non-Latin character that requires UTF-8 encoding, the JSON received from the COM object cannot be parsed in PHP, and json_last_error() reports a UTF-8 encoding problem.

I am positive that the COM object returns correctly encoded strings, as it is used in other environments where it works fine.

When I inspect the received string, the non-Latin characters are clearly "encoded" in a strange and invalid manner. When I check the same string inside the COM object, just before it is sent to PHP, it is correctly encoded.

It seems that communication between PHP and the COM object uses a non-UTF-8 encoding, and that corrupts the string.

The only thing I found related to using UTF-8 with COM objects is setting com.code_page=UTF-8 in the [COM] section of php.ini. However, regardless of how this is set, I get the same bad behavior.

What else should I do to get a properly encoded UTF-8 string from the COM object?


Solution

  • Well, the answer was right in front of my eyes; I just overlooked it:

    COM::__construct ( string $module_name [, mixed $server_name [, int $codepage [, string $typelib ]]] )

    There is a codepage parameter. If it is set to CP_UTF8, it works.

    $server_name should be NULL if no server is used.
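
To make the constructor signature above concrete, here is a minimal sketch of how the call might look. The ProgID `MyLibrary.MyClass`, the method `GetJson()`, and the helper names `decodeComJson()` and `fetchJsonFromCom()` are placeholders invented for illustration, not part of the original DLL; substitute the names your COM component actually exposes. It assumes PHP 7.3 or later on Windows with the com_dotnet extension enabled.

```php
<?php
// Decode a JSON string and fail loudly on errors such as malformed UTF-8.
// Hypothetical helper; JSON_THROW_ON_ERROR requires PHP 7.3 or later.
function decodeComJson(string $json)
{
    return json_decode($json, true, 512, JSON_THROW_ON_ERROR);
}

// Create the COM object with the UTF-8 code page and fetch the JSON.
// $progId and GetJson() are placeholders for your own component.
function fetchJsonFromCom(string $progId)
{
    // Second argument: NULL because no remote server is used.
    // Third argument: CP_UTF8 so strings crossing the COM boundary
    // are converted to and from UTF-8.
    $com = new COM($progId, null, CP_UTF8);

    $json = $com->GetJson(); // hypothetical method returning a JSON string

    return decodeComJson($json);
}

// Usage (Windows only, placeholder ProgID):
// $data = fetchJsonFromCom('MyLibrary.MyClass');
```

Passing the code page per object keeps the setting next to the code that depends on it, instead of relying on a php.ini value shared by every script on the machine.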