Search code examples
phparrayssortingcomputer-sciencesingly-linked-list

Sorting loosely connected data reminiscent of a singly linked list


I am looking for an efficient solution to sorting linked data.

This is the context:

A program returns an array of objects each containing two variables: id and previous_id. The id is an unique identifier and the previous_id refers to another id of an object in the array. The order of the objects in the array is random.

The obvious ordering is of course this: In the sorted array the first object is the one where the previous_id is null. The next object in the array should always be the object whose previous_id corresponds with the id of the current object, or it is the last object.

What is an efficient algorithm to sort these variables in the above mentioned ordering?

My intuition suggests there is a simple way to sort these variables. The way these objects are represented reminds me of a singly linked list. Yet I can't seem to find an elegant solution to this.

My current algorithm is overly complex and works like this:

  • First we prepare the array for sorting by building a hash map of arrays. The hashmap is built by iterating over the array and mapping the key to the id of the current object and the value to the current object, wrapped in an array of size 1.

The goal of these arrays is to always have the correct ordering,
except the first element, which can have a previous_id that may point to an element which is not in it's own array, or be null.

The key pointing to this array in the hash map is always the id of the last element in the array.

  • We then keep looping over the hashmap until the length of the hashmap is less or equal to 1.

  • In the loop we iterate over the the hashmap.

  • If the key of the current hash item is not null, we take the first element of the array which we will call head. We also take the last element of the array, which we call butt.

  • We can now merge two arrays into one by taking the array found in the hash map whose key is equal to the head's previous_id with current array in the iteration and deleting the found array in the hashmap.

The pseudocode is like this:

$tbr = $this->prepareDataForSorting();
while (count($tbr) > 1) {
    foreach ($tbr as $key => $children) {
        $head = array_first($children);
        if ($head->previousId !== null) {
            $butt = array_last($children);
            $tbr[$butt->id] = array_merge($tbr[$head->previousId], $children);
            unset($tbr[$head->previousId]);
        }
    }
}
return $tbr[array_key_first($tbr)];

Where the prepare for sorting procedure is implemented like this:

$tbr = [];
foreach ($this->children as $child) {
    $tbr[$child->id] = [$child];
}
return $tbr;

The problem I have with this algorithm is that is not not lightweight at all. It requires hash maps as an auxiliary data structure as well as an O(n^2) algorithm. I feel there should be a much simpler way.

The answer has been posted below. I decided to include it's php implementation for future visitors:

$mapping = $this->mapData();
$current = $mapping[null] ?? null;
$tbr = [];
while ($current !== null) {
    $tbr[] = $current;
    $current = $mapping[$current->id] ?? null;
}
return $tbr;

With map data:

$tbr = [];
foreach ($this->children as $child) {
    $tbr[$child->previousId] = [$child];
}
return $tbr;
  • simply beautiful

Solution

  • You can achieve what you want in better and more simpler way. My suggestion is as follows:

    • Create a hash map
    • Iterate over array and for each and every entry in the array, create an entry in hash map whose key and value are previous_id and current entry in the iteration, respectively.
    • Finally, you need to get hash value of key null (because your first item's previous_id is null), push its value to final array, and use the id of current entry as new key for hashmap.
    • Do the last step until the length of final array be the same as original array.

    Complexity of this approach is O(n) which is very effecient in comparision with O(n^2)