
php mb_convert_variables RECURSION error


I'm having a problem converting some arrays to UTF-8.

Basically, I'm extracting meta tags from websites, and where the charset is not UTF-8, I attempt to convert them to UTF-8 so they can be stored and displayed properly. The original array before conversion is as follows.

print_r($details);


    [base] => http://www.example.com/something/page/
    [charset] => iso-8859-1
    [favicon] => http://www.example.com/favicon.ico
    [meta] => Array
        (
            [description] => Some Description
            [keywords] => 
        )

    [images] => Array
        (
            [0] => http://cdn.example.com/wp-content/themes/original/images/logo.jpg
            [1] => http://cdn.examplecom/wp-content/uploads/2016/10/EXAMPLE-imageoptim-twitter-bird-16x16.png

        )

    [openGraph] => Array
        (
            [locale] => en_GB
            [type] => article
            [title] => Some Title
            [description] => Some Description
            [url] => http://www.example.com/something/page/
            [site_name] => EXAMPLE
            [image] => http://cdn-r1.example.com/wp-content/uploads/2017/02/621-Example-fb.jpg
            [image:width] => 736
            [image:height] => 378
            [imagePath] => http://cdn-r1.example.com/wp-content/uploads/2017/02/621-Example-fb.jpg
        )

    [title] => Some Title
    [url] => http://www.example.com/something/page/
    [url_description] => Some Description

    //End of print_r();

So the array above looks fine, but because I can't know in advance whether the text will display properly, I convert it to UTF-8, since the website states that its charset is NOT UTF-8.

I put the array above through the following

mb_convert_variables('utf-8', $details['charset'], $details);

Note that the output is strange for $details['meta'] and $details['openGraph']: each nested array has been replaced by *RECURSION*. I tried searching for this but could not find anything.

print_r($details);

//Note: This is the exact print_r output, including the *RECURSION* markers.

    [base] => http://www.example.com/something/page/
    [charset] => iso-8859-1
    [favicon] => http://www.example.com/favicon.ico
    [meta] => Array
 *RECURSION*

    [images] => Array
        (
            [0] => http://cdn.example.com/wp-content/themes/original/images/logo.jpg
            [1] => http://cdn.examplecom/wp-content/uploads/2016/10/EXAMPLE-imageoptim-twitter-bird-16x16.png

        )

    [openGraph] => Array
*RECURSION*
    [title] => Some Title
    [url] => http://www.example.com/something/page/
    [url_description] => Some Description

Because of the above, I cannot output it as JSON with

echo json_encode($details);
die();

However, if I serialize and then unserialize it, it works again.

echo json_encode(unserialize(serialize($details)));
die();

What is the problem with my array or my code? I can work with the serialize/unserialize workaround for now, but I would rather understand the problem before it affects all my future data.


Solution

  • This is a bug in mb_convert_variables that has already been reported. As a workaround, loop through the array manually and convert each string with mb_convert_encoding:

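    function mb_convert_array($to_encoding, $from_encoding, $array)
    {
        foreach($array as $key => $value)
        {
            if(is_array($value))
            {
                // Recurse into nested arrays (e.g. 'meta', 'openGraph').
                $array[$key] = mb_convert_array($to_encoding, $from_encoding, $value);
            }
            elseif(is_string($value))
            {
                // Convert strings only; other value types are left untouched.
                $array[$key] = mb_convert_encoding($value, $to_encoding, $from_encoding);
            }
        }

        return $array;
    }

    // The array is passed by value, so assign the converted copy back.
    $details = mb_convert_array('utf-8', $details['charset'], $details);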
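
As an alternative to writing the recursion by hand, the same conversion can probably be done with PHP's built-in array_walk_recursive, which visits every non-array leaf of a nested array. The sketch below is only an illustration: the helper name convert_strings_in_place and the sample data are made up for this example, and it assumes the mbstring extension is loaded and PHP 7.1 or later (for the type declarations).

```php
<?php
// Hypothetical helper (not part of the original answer): converts every
// string leaf of a nested array in place and leaves other values untouched.
function convert_strings_in_place(array &$data, string $to, string $from): void
{
    array_walk_recursive($data, function (&$value) use ($to, $from) {
        if (is_string($value)) {
            $value = mb_convert_encoding($value, $to, $from);
        }
    });
}

// Made-up sample data; "\xE9" is "é" encoded as ISO-8859-1.
$details = [
    'charset' => 'iso-8859-1',
    'meta'    => ['description' => "Caf\xE9", 'keywords' => ''],
    'title'   => 'Some Title',
];

// Read the source charset first, then convert the whole array.
$from = $details['charset'];
convert_strings_in_place($details, 'UTF-8', $from);

// With valid UTF-8 strings throughout, json_encode should no longer fail.
echo json_encode($details), PHP_EOL;
```

Because the array is modified through a reference, there is no return value to assign back. Array keys are not converted in either approach, which is fine here because the keys are plain ASCII.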