Search code examples
phparrayssortingusort

Sorting an array of strings by arbitrary numbers of substrings


I'm attempting to modify the OrderedImportsFixer class in php-cs-fixer so I can clean up my files the way I want. What I want is to order my imports in a fashion similar to what you'd see in a filesystem listing, with "directories" listed before "files".

So, given this array:

$indexes = [
    26 => ["namespace" => "X\\Y\\Zed"],
    9 =>  ["namespace" => "A\\B\\See"],
    3 =>  ["namespace" => "A\\B\\Bee"],
    38 => ["namespace" => "A\\B\\C\\Dee"],
    51 => ["namespace" => "X\\Wye"],
    16 => ["namespace" => "A\\Sea"],
    12 => ["namespace" => "A\\Bees"],
    31 => ["namespace" => "M"],
];

I'd like this output:

$sorted = [
    38 => ["namespace" => "A\\B\\C\\Dee"],
    3 =>  ["namespace" => "A\\B\\Bee"],
    9 =>  ["namespace" => "A\\B\\See"],
    12 => ["namespace" => "A\\Bees"],
    16 => ["namespace" => "A\\Sea"],
    26 => ["namespace" => "X\\Y\\Zed"],
    51 => ["namespace" => "X\\Wye"],
    31 => ["namespace" => "M"],
];

As in a typical filesystem listing:

screenshot of a directory listing

I've been going at uasort for a while (key association must be maintained) and have come close. Admittedly, this is due more to desperate flailing than any sort of rigorous methodology. Not really having a sense of how uasort works is kind of limiting me here.

// get the maximum number of namespace components in the list
$ns_counts = array_map(function($val){
    return count(explode("\\", $val["namespace"]));
}, $indexes);
$limit = max($ns_counts);

for ($depth = 0; $depth <= $limit; $depth++) {
    uasort($indexes, function($first, $second) use ($depth, $limit) {
        $fexp = explode("\\", $first["namespace"]);
        $sexp = explode("\\", $second["namespace"]);
        if ($depth === $limit) {
            // why does this help?
            array_pop($fexp);
            array_pop($sexp);
        }
        $fexp = array_slice($fexp, 0, $depth + 1, true);
        $sexp = array_slice($sexp, 0, $depth + 1, true);
        $fimp = implode(" ", $fexp);
        $simp = implode(" ", $sexp);
        //echo "$depth: $fimp <-> $simp\n";
        return strnatcmp($fimp, $simp);
    });
}
echo json_encode($indexes, JSON_PRETTY_PRINT);

This gives me properly sorted output, but with deeper namespaces on the bottom instead of the top:

{
    "31": {
        "namespace": "M"
    },
    "12": {
        "namespace": "A\\Bees"
    },
    "16": {
        "namespace": "A\\Sea"
    },
    "3": {
        "namespace": "A\\B\\Bee"
    },
    "9": {
        "namespace": "A\\B\\See"
    },
    "38": {
        "namespace": "A\\B\\C\\Dee"
    },
    "51": {
        "namespace": "X\\Wye"
    },
    "26": {
        "namespace": "X\\Y\\Zed"
    }
}

I'm thinking I may have to build a separate array for each level of namespace and sort it separately, but have drawn a blank on how I might do that. Any suggestions for getting the last step of this working, or something completely different that doesn't involve so many loops?


Solution

  • I believe the following should work:

    uasort($indexes, static function (array $entry1, array $entry2): int {  
        $ns1Parts = explode('\\', $entry1['namespace']);
        $ns2Parts = explode('\\', $entry2['namespace']);
    
        $ns1Length = count($ns1Parts);
        $ns2Length = count($ns2Parts);
    
        for ($i = 0; $i < $ns1Length && isset($ns2Parts[$i]); $i++) {
            $isLastPartForNs1 = $i === $ns1Length - 1;
            $isLastPartForNs2 = $i === $ns2Length - 1;
    
            if ($isLastPartForNs1 !== $isLastPartForNs2) {
                return $isLastPartForNs1 <=> $isLastPartForNs2;
            }
    
            $nsComparison = $ns1Parts[$i] <=> $ns2Parts[$i];
    
            if ($nsComparison !== 0) {
                return $nsComparison;
            }
        }
    
        return 0;
    });
    

    What it does is:

    • split namespaces into parts,
    • compare each part starting from the first one, and:
      • if we're at the last part for one and not the other, prioritize the one with the most parts,
      • otherwise, if the respective parts are different, prioritize the one that is before the other one alphabetically.

    Demo