Ignore accented characters when sorting a multidimensional array in PHP


I have a multidimensional array, shown below, in which I want to sort the [areas] entries by their [name] field. Accented letters should sort as though they were unaccented.

Array
(
    [chicago] => Array
        (
            [community_name] => Chicago, IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => HELLO WORLD.
                                )
                        )

                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Hello
                                )

                        )

                    [2] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Administration.
                                )
                        )
                )

        )

    [chicago-and-surrounding-areas] => Array
        (
            [community_name] => Chicago (and surrounding areas), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Covit Corp. 
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Câble-Axion Digital Corp. 
                                )
                        )   
                )

        )

    [cambridge-chicago] => Array
        (
            [community_name] => Cambridge (Chicago), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Avocados.
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Aṕple.
                                )
                        )   
                )

        )

)

This is what I want to achieve:

Array
(
    [chicago] => Array
        (
            [community_name] => Chicago, IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Administration.
                                )
                        )

                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => HELLO WORLD. 
                                )

                        )

                    [2] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Hello
                                )
                        )
                )

        )

    [chicago-and-surrounding-areas] => Array
        (
            [community_name] => Chicago (and surrounding areas), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Câble-Axion Digital Corp.
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Covit Corp. 
                                )
                        )   
                )

        )

    [cambridge-chicago] => Array
        (
            [community_name] => Cambridge (Chicago), IL
            [areas] => Array
                (
                    [0] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Aṕple.
                                )
                        )
                    [1] => Array
                        (
                            [name] => Array
                                (
                                    [0] => Avocados.
                                )
                        )   
                )

        )

)

This is what I have tried, but I am not sure it will work in all cases. In some cases, even after sorting, accented letters rank lower than their non-accented counterparts.

What changes should I make to the code below so that accented letters sort as though they were unaccented?

foreach ($array as &$locality) {
    usort($locality['areas'], function ($a, $b) {
        // return $a['name'][0] <=> $b['name'][0];
        return iconv('UTF-8', 'ISO-8859-8//TRANSLIT', $a['name'][0]) <=> iconv('UTF-8', 'ISO-8859-8//TRANSLIT', $b['name'][0]);
    });
}

Solution

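  • You can use Normalizer (from the intl extension) to decompose each accented character into its base letter plus combining diacritics, then remove the diacritics so that only the base characters remain.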

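    function stripDiacritics(string $string): string {
        return preg_replace(
            // Combining diacritical marks left behind by the decomposition.
            '/[\x{0300}-\x{036f}]/u',
            '',
            // FORM_D splits each accented letter into base letter + mark(s).
            Normalizer::normalize($string, Normalizer::FORM_D)
        );
    }

    foreach ($array as &$locality) {
        // Compare the names with their diacritics stripped.
        usort($locality['areas'], function ($a, $b) {
            return stripDiacritics($a['name'][0]) <=> stripDiacritics($b['name'][0]);
        });
    }
    // Break the reference so later writes to $locality cannot alter $array.
    unset($locality);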

    Next time, post your array using var_export so we can use it to test the code :)

    The range \x{0300}-\x{036f} is the Unicode Combining Diacritical Marks block.
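
To make the approach above easy to try, here is a minimal self-contained sketch that runs it against two of the groups from the question. It assumes the intl extension is loaded and PHP 7.4 or later (for arrow functions). The fallbacks to the raw string on failure and the $sorted summary variable are additions for illustration and are not part of the original answer.

```php
<?php
// Requires the intl extension, which provides the Normalizer class.

function stripDiacritics(string $string): string {
    // FORM_D splits each accented letter into base letter + combining mark(s).
    $decomposed = Normalizer::normalize($string, Normalizer::FORM_D);
    if ($decomposed === false) {
        // Assumption: on invalid input, fall back to the untouched string.
        return $string;
    }
    // Remove the combining marks (U+0300 to U+036F), keeping the base letters.
    return preg_replace('/[\x{0300}-\x{036f}]/u', '', $decomposed) ?? $string;
}

// Sample data in the array-literal shape that var_export would give you.
$array = [
    'chicago-and-surrounding-areas' => [
        'community_name' => 'Chicago (and surrounding areas), IL',
        'areas' => [
            ['name' => ['Covit Corp.']],
            ['name' => ['Câble-Axion Digital Corp.']],
        ],
    ],
    'cambridge-chicago' => [
        'community_name' => 'Cambridge (Chicago), IL',
        'areas' => [
            ['name' => ['Avocados.']],
            ['name' => ['Aṕple.']],
        ],
    ],
];

foreach ($array as &$locality) {
    usort($locality['areas'], function ($a, $b) {
        return stripDiacritics($a['name'][0]) <=> stripDiacritics($b['name'][0]);
    });
}
unset($locality); // break the reference left over from the loop

// Collect just the names per group so the resulting order is easy to inspect.
$sorted = array_map(
    fn ($locality) => array_map(fn ($area) => $area['name'][0], $locality['areas']),
    $array
);
var_export($sorted);
```

Only the comparison uses the stripped strings, so the stored names keep their accents. The comparison is still case-sensitive ("HELLO WORLD." sorts before "Hello"), which matches the expected output in the question.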