I'm getting a JSON string from an API like this:
{ "title":"Example string's with "special" characters" }
which json_decode cannot decode (it returns null).
So I want to change it into something decodable, like:
{ "title":"Example string's with \"special\" characters" }
or
{ "title":"Example string's with 'special' characters" }
so that json_decode works. What should I do?
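For reference, a minimal snippet that reproduces the failure and shows that the escaped form decodes (the variable names are only illustrative):

```php
<?php
// The unescaped inner quotes end the string early, so parsing fails.
$bad  = '{ "title":"Example string\'s with "special" characters" }';
// Same payload with the inner quotes escaped.
$good = '{ "title":"Example string\'s with \"special\" characters" }';

var_dump(json_decode($bad));        // NULL
echo json_last_error_msg(), "\n";   // reports why decoding failed

$data = json_decode($good, true);   // decode to an associative array
echo $data['title'], "\n";
```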
I have been trying to solve this tricky problem since yesterday, and after a lot of hair-pulling I came up with this solution.
First let us clarify our assumptions.
Analyzing the problem:
We know that JSON keys are formatted like (,"keyString":) and JSON values like (:"valueString",), where:
keyString: any sequence of characters that does not contain (":).
valueString: any sequence of characters that does not contain (",).
Our goal is to escape the quotation marks inside valueString. To achieve that, we need to separate the keyStrings from the valueStrings.
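The separation step can be sketched with a much simpler pattern. This is only an illustration under stated assumptions: `splitKeyValuePairs` is a hypothetical helper, it handles string values only, and it assumes keys contain no double quotes.

```php
<?php
// Hypothetical helper: splits the body of a flat JSON object (without the
// surrounding braces) into key => raw value pairs. String values only.
function splitKeyValuePairs(string $body): array
{
    // Wrap in commas so every pair looks like ,"key":"value",
    $body = ',' . trim($body) . ',';
    // Key: a run of non-quote characters.
    // Value: everything up to the first quote that is followed by a comma.
    $pattern = '~,\s*"([^"]+)"\s*:\s*"(.*?)"\s*(?=,)~';
    preg_match_all($pattern, $body, $matches, PREG_SET_ORDER);

    $pairs = [];
    foreach ($matches as $match) {
        $pairs[$match[1]] = $match[2];
    }
    return $pairs;
}

$pairs = splitKeyValuePairs('"title":"Example string\'s with "special" characters", "lang":"en"');
print_r($pairs);
```

Once a value is isolated like this, any quotation mark left inside it must belong to the text rather than to the JSON syntax, so it can be escaped safely.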
The solution: using these facts, the code is:
/**
 * Escapes unescaped double quotes inside the string values of a flat JSON object.
 * Only integer, empty-string, and string values are recognized.
 */
function escapeJsonValues($json_str) {
    // Groups: 1 = key, 2 = integer value, 3 = empty string value, 4 = text value
    $pattern = '~(?:,\s*"((?:.(?!"\s*:))+.)"\s*(?=\:))(?:\:\s*(?:(\d+)|("\s*")|(?:"((?!\s*")(?:.(?!"\s*,))+.)")))~';
    // remove the surrounding { }
    $json_str = rtrim(trim(trim($json_str), '{'), '}');
    if (strlen($json_str) < 5) {
        // not a valid JSON string
        return null;
    }
    // put a comma at the start and the end of the string
    $json_str = ($json_str[strlen($json_str) - 1] === ',') ? ',' . $json_str : ',' . $json_str . ',';
    // strip newlines and tabs from the string
    $json_str = preg_replace('~[\r\n\t]~', '', $json_str);
    preg_match_all($pattern, $json_str, $matches);
    $json = '{';
    for ($i = 0; $i < count($matches[0]); $i++) {
        $json .= '"' . $matches[1][$i] . '":';
        if (strlen($matches[2][$i]) > 0) {
            // value is an integer
            $json .= $matches[2][$i] . ',';
        } elseif (strlen($matches[3][$i]) > 0) {
            // empty value
            $json .= '"",';
        } else {
            // text: any quotation mark here belongs to the text, not to the JSON
            // syntax, so it can be escaped safely; existing backslashes are
            // replaced with forward slashes
            $json .= '"' . str_replace(['\\', '"'], ['/', '\"'], $matches[4][$i]) . '",';
        }
    }
    return trim(rtrim($json, ','), ',') . '}';
}
Note: The code takes white space between the JSON tokens into account.
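One way to use a repair function like this is to call it only when normal decoding fails. The wrapper below is a sketch, not part of the solution above: `decodeWithRepair` is a hypothetical name, and the repair step is passed in as a callable so the snippet stays self-contained.

```php
<?php
// Hypothetical wrapper: $repair is any callable that takes the raw string
// and returns a repaired JSON string, or null if it gives up.
function decodeWithRepair(string $raw, callable $repair): ?array
{
    $data = json_decode($raw, true);
    if (is_array($data)) {
        return $data;   // already valid, nothing to repair
    }
    $fixed = $repair($raw);
    if ($fixed === null) {
        return null;    // the repair step could not handle the input
    }
    $data = json_decode($fixed, true);
    return is_array($data) ? $data : null;
}
```

With the function above it would be called as decodeWithRepair($raw, 'escapeJsonValues'), so well-formed responses are never rewritten.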