I'm trying to parse a JSON response from a web service I have no control over.
These are the headers
This is the body I see in PHP, with sensitive parts hidden
I'm using the Guzzle HTTP client to send the request and retrieve the response.
If I try to decode it directly I receive an empty object, so I'm assuming a conversion is needed. I'm trying to convert the response contents like this:
json_decode(iconv($charset, 'UTF-8', $contents))
or
mb_convert_encoding($contents, 'UTF-8', $charset);
Both of these throw an exception:
Notice: iconv(): Wrong charset, conversion from 'windows-1253' to 'UTF-8' is not allowed in Client.php on line 205
Warning: mb_convert_encoding(): Illegal character encoding specified in Client.php on line 208
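For reference, here is a minimal sketch of the convert-then-decode step with explicit failure checks, so that a failed conversion is not silently passed to json_decode. The helper names charsetFromContentType and decodeJsonBody are hypothetical, and it assumes the charset arrives in the Content-Type header (the actual headers are not shown above) and PHP 7.1 or later.

```php
<?php
/**
 * Extract the charset parameter from a Content-Type header value,
 * e.g. "application/json; charset=windows-1253". Hypothetical helper.
 */
function charsetFromContentType(string $contentType): ?string
{
    if (preg_match('/charset\s*=\s*"?([^";\s]+)"?/i', $contentType, $m) === 1) {
        return $m[1];
    }
    return null;
}

/**
 * Convert $contents to UTF-8 (when a non-UTF-8 charset is given) and decode it as JSON.
 * Hypothetical helper; throws RuntimeException when either step fails.
 *
 * @return mixed
 */
function decodeJsonBody(string $contents, ?string $charset)
{
    if ($charset !== null && strcasecmp($charset, 'UTF-8') !== 0) {
        // @ hides the notice; the failure is detected through the false return value.
        $converted = @iconv($charset, 'UTF-8', $contents);
        if ($converted === false) {
            throw new RuntimeException("iconv could not convert from '$charset' to UTF-8");
        }
        $contents = $converted;
    }

    $data = json_decode($contents, true);
    if (json_last_error() !== JSON_ERROR_NONE) {
        throw new RuntimeException('JSON decode failed: ' . json_last_error_msg());
    }
    return $data;
}
```

With this shape, the "Wrong charset" failure surfaces as an exception at the conversion step instead of as an empty result from json_decode.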
I've used this piece of code successfully before but I can't understand why it fails now.
Sending the same request using Postman correctly retrieves the data without broken characters, and it seems to show the same headers and body.
I'm updating based on comments.
mb_detect_encoding($response->getBody())
-> UTF-8
mb_detect_encoding($response->getBody()->getContents())
-> ASCII
json_last_error_msg()
-> Malformed UTF-8 characters, possibly incorrectly encoded
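The "Malformed UTF-8" message can be reproduced in isolation, which helps confirm that the body really contains non-UTF-8 bytes. This is only an illustrative sketch: the sample payload is made up (a lone 0xE1 byte, which is the Greek letter alpha in Windows-1253 but not valid UTF-8 on its own), and jsonDecodeError is a hypothetical helper name.

```php
<?php
/**
 * Try to decode $raw as JSON and return the error message, or null on success.
 * Hypothetical helper for illustration.
 */
function jsonDecodeError(string $raw): ?string
{
    json_decode($raw);
    return json_last_error() === JSON_ERROR_NONE ? null : json_last_error_msg();
}

// Made-up payload: 0xE1 is a single-byte letter in Windows-1253, not valid UTF-8.
$raw = "{\"name\":\"\xE1\"}";

var_dump(mb_check_encoding($raw, 'UTF-8'));       // strict validity check, unlike mb_detect_encoding()
var_dump(json_last_error() === JSON_ERROR_NONE);  // state before decoding
echo jsonDecodeError($raw), PHP_EOL;
```

mb_check_encoding gives a yes/no answer for one specific encoding, whereas mb_detect_encoding only guesses, so it is the more reliable check here.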
Additionally, as a trial-and-error attempt, I looped over all iconv encodings to see whether any of them could convert the string to UTF-8 without an error. I used this function to detect the encoding:
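private function detectEncoding($str)
{
    $iconvEncodings = [...]; // list of iconv encodings, elided
    $finalEncoding = "unknown";
    foreach ($iconvEncodings as $encoding) {
        try {
            // iconv() returns false on failure; it only throws when
            // notices are converted to exceptions by an error handler
            if (iconv($encoding, 'UTF-8', $str) !== false) {
                return $encoding;
            }
        } catch (\Exception $exception) {
            continue;
        }
    }
    return $finalEncoding;
}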
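A variant of the brute-force check above, sketched without relying on exceptions: it tests the return value of iconv and collects every candidate that converts cleanly instead of stopping at the first one. The function name findConvertibleEncodings and the candidate list are hypothetical. Keep in mind that many single-byte encodings accept almost any byte sequence, so a successful conversion does not prove the encoding is the right one.

```php
<?php
/**
 * Return every candidate encoding from which iconv can convert $bytes to UTF-8.
 * Hypothetical helper; $candidates is whatever list of names you want to try.
 *
 * @param string[] $candidates
 * @return string[]
 */
function findConvertibleEncodings(string $bytes, array $candidates): array
{
    $matches = [];
    foreach ($candidates as $encoding) {
        // @ suppresses the notice/warning; failure is reported as a false return value.
        if (@iconv($encoding, 'UTF-8', $bytes) !== false) {
            $matches[] = $encoding;
        }
    }
    return $matches;
}

// Example: a lone 0xE9 byte is not valid UTF-8, and the middle name is not a real charset.
print_r(findConvertibleEncodings("caf\xE9", ['UTF-8', 'NOT-A-REAL-CHARSET', 'ISO-8859-1']));
```

If this returns an empty list even for encodings that should obviously work, the problem is more likely the iconv installation than the data.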
No encoding worked; every one of them gave the same exception. I'm assuming the problem is with retrieving the response JSON correctly via Guzzle and not with iconv itself, since it seems implausible that none of the 1000+ encodings matches.
Some more info with cURL
I just retried the same payload using cURL:
/**
 * @param $options
 * @return bool|string
 */
public function makeCurlRequest($options)
{
    $payload = json_encode($options);

    // Prepare new cURL resource
    $ch = curl_init($this->softoneurl);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,   // return web page
        CURLOPT_HEADER         => false,  // don't return headers
        CURLOPT_FOLLOWLOCATION => true,   // follow redirects
        CURLOPT_MAXREDIRS      => 10,     // stop after 10 redirects
        CURLOPT_ENCODING       => "",     // handle compressed
        CURLOPT_USERAGENT      => "test", // name of client
        CURLOPT_AUTOREFERER    => true,   // set referrer on redirect
        CURLOPT_CONNECTTIMEOUT => 120,    // time-out on connect
        CURLOPT_TIMEOUT        => 120,    // time-out on response
        CURLINFO_HEADER_OUT    => true,
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => $payload,
    ]);

    // Set HTTP Header for POST request
    curl_setopt($ch, CURLOPT_HTTPHEADER, array(
        'Content-Type: application/json',
        'Content-Length: ' . strlen($payload),
    ));

    // Submit the POST request
    $result = curl_exec($ch);

    // Close cURL session handle
    curl_close($ch);

    return $result;
}
I received the exact same string and the exact same results when converting it. Perhaps there's an option I'm missing?
Apparently there's something wrong with iconv itself in the environment, and it's not application-specific. Running the following command via SSH:
php -r "var_dump(iconv('Windows-1253', 'UTF-8', 'test'));"
yields
PHP Notice: iconv(): Wrong charset, conversion from `Windows-1253' to `UTF-8' is not allowed in Command line code on line 1
PHP Stack trace:
PHP 1. {main}() Command line code:0
PHP 2. iconv(*uninitialized*, *uninitialized*, *uninitialized*) Command line code:1
Command line code:1:
bool(false)
Perhaps some dependency is missing.
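Since the failure depends on the environment rather than the application, a small diagnostic script can be run under each PHP binary (CLI over SSH, web server) to compare them. This is a sketch only; iconvSupports is a hypothetical helper, while PHP_SAPI, ICONV_IMPL and ICONV_VERSION are constants provided by PHP and its iconv extension.

```php
<?php
/**
 * Report whether iconv in the current PHP process can convert from $charset to UTF-8.
 * Hypothetical helper; it converts a short ASCII string and checks for a false return.
 */
function iconvSupports(string $charset): bool
{
    return @iconv($charset, 'UTF-8', 'test') !== false;
}

// Print which PHP binary and which iconv implementation are in use.
printf("SAPI: %s\n", PHP_SAPI);
printf("iconv implementation: %s %s\n", ICONV_IMPL, ICONV_VERSION);

// Probe the charsets of interest.
foreach (['UTF-8', 'Windows-1253'] as $charset) {
    printf("%-14s %s\n", $charset, iconvSupports($charset) ? 'ok' : 'NOT available');
}
```

If the same script reports different results under the CLI and under the web server, the two binaries do not see the same iconv setup.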
About 14 hours of troubleshooting later, I'm able to answer my own question. In my case the code was running as a CLI command, and the CLI PHP binary didn't have access to some of the libraries iconv needs.
More specifically, the gconv libraries. In my case, on Debian 9, they were located in
/usr/lib/x86_64-linux-gnu/gconv
This folder contains a library for each encoding iconv can use. A good way to see this, on a system where you have root access, is to run the command
strace iconv -f <needed_encoding> -t utf-8
The output lists the many paths iconv tries to access, including the gconv folder, and points you to the location of the files you need to make available in your SSH environment. If you don't have root access, you will have to ask your hosting provider.
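To check from PHP itself whether the gconv folder is visible to the current process, something like the following sketch may help. listGconvModules is a hypothetical helper, the path is the Debian 9 location mentioned above and may differ on other systems, and the expectation that the Windows-1253 converter ships as CP1253.so is an assumption about typical glibc packaging that is worth confirming with strace.

```php
<?php
/**
 * List the converter modules (*.so files) visible in a gconv directory.
 * Returns an empty array when the directory is missing or unreadable,
 * which is what a restricted shell environment would produce.
 * Hypothetical helper for illustration.
 *
 * @return string[]
 */
function listGconvModules(string $dir): array
{
    if (!is_dir($dir) || !is_readable($dir)) {
        return [];
    }
    $modules = [];
    foreach (glob(rtrim($dir, '/') . '/*.so') ?: [] as $path) {
        $modules[] = basename($path, '.so');
    }
    sort($modules);
    return $modules;
}

// Path taken from the answer above (Debian 9); adjust for your system.
$dir = '/usr/lib/x86_64-linux-gnu/gconv';
$modules = listGconvModules($dir);

printf("%d gconv modules visible in %s\n", count($modules), $dir);
// Assumption: the Windows-1253 converter is named CP1253 on glibc systems.
printf("CP1253 module present: %s\n", in_array('CP1253', $modules, true) ? 'yes' : 'no');
```

Running this under the CLI binary and comparing with the strace output shows quickly whether the modules are simply not reachable from that environment.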