Search code examples
phpquotesjson

JSON Encode and curly quotes


I've run into an interesting behavior in the native PHP 5 implementation of json_encode(). Apparently when serializing an object to a json string, the encoder will null out any properties that are strings containing "curly" quotes, the kind that would potentially be copy-pasted out of MS Word documents with the auto conversion enabled.

Is this an expected behavior of the function? What can I do to force these kinds of characters to covert to their basic equivalents? I've checked for character encoding mismatches between the database returning the data and the administration page the inserts it and everything is setup correctly - it definitely seems like the encoder just refuses these values because of these characters. Has anyone else encountered this behavior?

EDIT:

To clarify;

MSWord will take standard quotation marks and apostraphes and convert them to more aesthetic "fancy" or "curly" quotes. These characters can cause problems when placed in content managers that have charset mistmatches between their editing interface (in the html) and the database encoding.

That's not the problem here, though. For example, I have a json_object representing a person's profile and the string:

Jim O’Shea

The UTF code for that apostraphe being \u2019

Will come out null in the json object when fetched from database and directly json_encoded.

{"model_name":"Bio","logged":true,"BioID":"17","Name":null,"Body":"Profile stuff!","Image":"","Timestamp":"2011-09-23 11:15:24","CategoryID":"1"}


Solution

  • Never had this specific problem (i.e. with json_encode()) but a simple - albeit a bit ugly - solution I have used in other places is to loop through your data and pass it through this function I got from somewhere (will credit it when I find out where I got it):

    function convert_fancy_quotes ($str) {
      return str_replace(array(chr(145),chr(146),chr(147),chr(148),chr(151)),array("'","'",'"','"','-'),$str);
    }