Search code examples
phparraysperformancesorting

Performance tips for finding unique permutation


TLDR: how to find multidimensional array permutation in php and how to optimize for bigger arrays?

This is continuation of this question: how to find multidimensional array permutation in php

we have script for sorting array, idea is to find unique permutation of array, rules to find this permutation are:

  1. Input array contains set of arrays.
  2. Each inner array contains unique elements.
  3. Each inner array may have different length and different values.
  4. Output array must contain exact same values.
  5. Output inner array must have unique values on same key.
  6. If there is no solution, wildcard ie.: null are allowed.
  7. Wildcards can be duplicated on same key.
  8. Solution should have as few wildcards as possible.
  9. Algorithm should be able to handle array up to 30x30 in less than 180 s.

i have this solution so far:

function matrix_is_solved(array $matrix) {
    foreach (array_keys(current($matrix)) as $offset) {
        $column = array_filter($raw = array_column($matrix, $offset));
        if (count($column) != count(array_unique($column))) return false;
    }
    return true;
}

function matrix_generate_vectors(array $matrix) {
    $vectors = [];
    $columns = count(current($matrix));
    $gen = function ($depth=0, $combo='') use (&$gen, &$vectors, $columns) {
        if ($depth < $columns)
             for ($i = 0; $i < $columns; $i++)
                $gen($depth + 1, $i . $combo);
        else
            $vectors[] = array_map('intval', str_split($combo));
    };
    $gen();
    return $vectors;
}

function matrix_rotate(array $matrix, array $vector) {
   foreach ($matrix as $row => &$values) {
       array_rotate($values, $vector[$row]);
   }
   return $matrix;
}

function matrix_brute_solve(array $matrix) {
    matrix_make_square($matrix);
    foreach (matrix_generate_vectors($matrix) as $vector) {
        $attempt = matrix_rotate($matrix, $vector);
        if (matrix_is_solved($attempt))
            return matrix_display($attempt);
    }
    echo 'No solution';
}

function array_rotate(array &$array, $offset) {
    foreach (array_slice($array, 0, $offset) as $key => $val) {
        unset($array[$key]);
        $array[$key] = $val;
    }
    $array = array_values($array);
}

function matrix_display(array $matrix = null) {
    echo "[\n";
    foreach ($matrix as $row => $inner) {
        echo "  $row => ['" . implode("', '", $inner) . "']\n";
    }
    echo "]\n";
}

function matrix_make_square(array &$matrix) {
    $pad = count(array_keys($matrix));
    foreach ($matrix as &$row)
        $row = array_pad($row, $pad, '');
}

$tests = [
[ ['X'], ['X'] ],
[ ['X'], ['X'], ['X'] ],
[ [ 'X', '' ], [ '', 'X' ] ],
[ ['X', 'Y', 'Z'], ['X', 'Y'], ['X']],
[ ['X', 'Y'], ['X', 'Y'], ['X', 'Y'] ]
];
array_map(function ($matrix) {
    matrix_display($matrix);
    echo "solved by:" . PHP_EOL;
    matrix_brute_solve($matrix);
    echo PHP_EOL;
}, $tests);

And this works perfectly on small array, but trouble is this iterating over all possibilities of array movements, and for array like 6x6 this is just too much to compute - O(nn) in both time and space!


Solution

  • After quite some fiddling, I came up with the following code.

    The idea is to identify conflicting elements and swap them to a column where they are no longer a problem. For cases where this is not applicable a random selection is done. The code works recursive and thus there are edge-cases where it takes very long to complete.

    An extreme edge-case is an input where all rows consist of exactly the same values.

    <?php
    declare(strict_types=1);
    
    final class SwapSolver
    {
        /**
         * @param array $input
         *
         * @return array
         */
        public function solve(array $input): array
        {
            $input = array_values($input);
    
            return $this->swapDuplicates($this->prepare($input, $this->getMinRowLength($input)));
        }
    
        /**
         * @param array $input
         *
         * @return array
         */
        private function swapDuplicates(array $input): array
        {
            $unswappable = [];
    
            foreach ($this->duplicates($input) as $position) {
                list($r, $a) = $position;
    
                $swapped = false;
                foreach ($this->swapCandidates($input, $r, $a, true) as $b) {
                    $input[$r] = $this->swap($input[$r], $a, $b);
                    $swapped = true;
                    break;
                }
    
                if (!$swapped) {
                    $unswappable[] = $position;
                }
            }
    
            // still unswappable
            $unswappable = array_values(array_filter($unswappable, function (array $position) use ($input): bool {
                return $this->isDuplicate($input, ...$position);
            }));
    
            // tie breaker
            if (count($unswappable) > 0) {
                list($r, $a) = $unswappable[mt_rand(0, count($unswappable) - 1)];
    
                $candidates = [];
                foreach ($this->swapCandidates($input, $r, $a, false) as $b) {
                    $candidates[] = $b;
                }
    
                $input[$r] = $this->swap($input[$r], $a, $candidates[mt_rand(0, count($candidates) - 1)]);
    
                return $this->swapDuplicates($input);
            }
    
            return $input;
        }
    
        /**
         * @param array $input
         *
         * @return \Generator
         */
        private function duplicates(array &$input): \Generator
        {
            foreach ($input as $r => $row) {
                foreach ($row as $c => $value) {
                    if ($this->isDuplicate($input, $r, $c)) {
                        yield [$r, $c];
                    }
                }
            }
        }
    
        /**
         * @param array $input
         * @param int   $row
         * @param int   $column
         *
         * @return bool
         */
        private function isDuplicate(array $input, int $row, int $column): bool
        {
            $candidate = $input[$row][$column];
    
            if (is_null($candidate)) {
                return false;
            }
    
            foreach (array_column($input, $column) as $r => $value) {
                if ($r !== $row && $value === $candidate) {
                    return true;
                }
            }
    
            return false;
        }
    
        /**
         * @param array $input
         * @param int   $row
         * @param int   $column
         * @param bool  $strict
         *
         * @return \Generator
         */
        private function swapCandidates(array &$input, int $row, int $column, bool $strict): \Generator
        {
            foreach ($input[$row] as $c => $dst) {
                if ((!$strict || !in_array($input[$row][$column], array_column($input, $c), true))
                    && (is_null($dst) || !in_array($dst, array_column($input, $column), true))
                ) {
                    yield $c;
                }
            }
        }
    
        /**
         * @param array $row
         * @param int   $from
         * @param int   $to
         *
         * @return array
         */
        private function swap(array $row, int $from, int $to): array
        {
            $tmp = $row[$to];
            $row[$to] = $row[$from];
            $row[$from] = $tmp;
    
            return $row;
        }
    
        /**
         * @param array $input
         * @param int   $padSize
         *
         * @return array
         */
        private function prepare(array $input, int $padSize): array
        {
            return array_map(function (array $row) use ($padSize): array {
                $row = array_pad($row, $padSize, null);
                shuffle($row);
    
                return $row;
            }, $input);
        }
    
        /**
         * @param array $input
         *
         * @return int
         */
        private function getMinRowLength(array $input): int
        {
            return max(
                ...array_values(array_count_values(array_merge(...$input))),
                ...array_map('count', $input)
            );
        }
    }
    

    Usage:

    <?php
    $solver = new SwapSolver();
    $solution = $solver->solve($input);