Tags: php, string, encoding, character-encoding

How to replace Microsoft-encoded quotes in PHP


I need to replace Microsoft Word's single and double quotation marks (“ ” ‘ ’) with regular quotes (' and ") because of an encoding issue in my application. I do not need them to be HTML entities, and I cannot change my database schema.

I have two options: use either a regular expression or an associative array.

Is there a better way to do this?


Solution

  • Since you only want to replace a few specific, well-identified characters, I would go for str_replace with an array: you obviously don't need the heavy artillery that regex brings ;-)

    And if you encounter other special characters (damn copy-paste from Microsoft Word...), you can just add them to that array whenever they are identified.
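
    As a rough sketch of how such a list can stay easy to extend, the pairs can live in one associative array passed to strtr. This assumes the input is Windows-1252 (single-byte) text; the function name replace_word_chars and the extra entries (en dash, ellipsis) are illustrative, not taken from the linked page.

    ```php
    <?php
    // Hypothetical helper: maps Windows-1252 bytes to plain ASCII.
    // Assumes $string is Windows-1252 (single-byte), not UTF-8.
    function replace_word_chars(string $string): string
    {
        // One "byte => replacement" pair per line; add new pairs as they turn up.
        $map = [
            "\x91" => "'",   // left single quotation mark
            "\x92" => "'",   // right single quotation mark
            "\x93" => '"',   // left double quotation mark
            "\x94" => '"',   // right double quotation mark
            "\x96" => '-',   // en dash
            "\x97" => '-',   // em dash
            "\x85" => '...', // horizontal ellipsis
        ];

        // strtr with an array replaces every key in a single pass.
        return strtr($string, $map);
    }
    ```

    Keeping each character next to its replacement avoids the two parallel arrays drifting out of step when entries are added.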


    The best answer I can give to your comment is probably this link: Convert Smart Quotes with PHP

    And the associated code (quoting that page):

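    // The chr() values are Windows-1252 byte codes, so $string is
    // assumed to be Windows-1252 encoded.
    function convert_smart_quotes($string)
    {
        $search = array(chr(145),  // left single quote
                        chr(146),  // right single quote
                        chr(147),  // left double quote
                        chr(148),  // right double quote
                        chr(151)); // em dash

        $replace = array("'",
                         "'",
                         '"',
                         '"',
                         '-');

        return str_replace($search, $replace, $string);
    }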

    (I don't have Microsoft Word on this computer, so I can't test it myself)

    I don't remember exactly what we used at work (I was not the one who had to deal with that kind of input), but it was the same kind of thing...
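
    One caveat worth sketching: the chr() values in convert_smart_quotes are Windows-1252 bytes. If the text has already been stored as UTF-8, each of these characters is a three-byte sequence, so the single-byte search will not match it and can damage other multibyte characters (for example, Ñ contains the byte 0x91). A possible UTF-8 variant, assuming PHP 7.0 or later for the "\u{...}" escapes, could look like this; the name convert_smart_quotes_utf8 is made up for illustration.

    ```php
    <?php
    // Hypothetical UTF-8 counterpart of convert_smart_quotes().
    // Assumes $string is valid UTF-8 and PHP >= 7.0 ("\u{...}" escapes).
    function convert_smart_quotes_utf8(string $string): string
    {
        $search = [
            "\u{2018}", // left single quotation mark
            "\u{2019}", // right single quotation mark
            "\u{201C}", // left double quotation mark
            "\u{201D}", // right double quotation mark
            "\u{2014}", // em dash
        ];

        $replace = ["'", "'", '"', '"', '-'];

        // str_replace compares bytes; matching whole UTF-8 sequences is safe.
        return str_replace($search, $replace, $string);
    }
    ```

    Which of the two versions applies depends on the encoding the application actually receives, so it is worth checking that before picking one.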