Search code examples
phpgraphheapspl

Is there a way to make PHP's SplHeap recalculate? (aka: add up-heap to SplHeap?)


I am using an SplHeap to hold graph nodes of a tree with directed edges that will be traversed from the leaves to the root. For this, I precalculate the "fan-in" of nodes and put them into the heap so that I can always retrieve the node with the smallest fan-in (0) from it.

After visiting a node, I reduce the fan-in of its successor by 1. Then obviously, the heap needs to be recalculated because the successor is now in the wrong place there. I have tried recoverFromCorruption(), but it doesn't do anything and keeps the heap in the wrong order (node with larger fanIn stays in front of smaller fanIn).

As a workaround, I'm now creating a new heap after each visit, amounting to a full O(N*log(N)) sort each time.

It should be possible, however, to make up-heap operations on the changed heap entry until it's in the right position in O(log(N)).

The API for SplHeap doesn't mention an up-heap (or deletion of an arbitrary element - it could then be re-added). Can I somehow derive a class from SplHeap to do this or do I have to create a pure PHP heap from scratch?

EDIT: Code example:

class VoteGraph {
    private $nodes = array();

    private function calculateFanIn() { /* ... */ }

    // ...

    private function calculateWeights() {
        $this->calculateFanIn();
        $fnodes = new GraphNodeHeap(); // heap by fan-in ascending (leaves are first)

        foreach($this->nodes as $n) {
            // omitted: filter loops
            $fnodes->insert($n);
        }

        // traversal from leaves to root
        while($fnodes->valid()) {
            $node = $fnodes->extract(); // fetch a leaf from the heap
            $successor = $this->nodes[$node->successor];
            // omitted: actual job of traversal
            $successor->fanIn--; // will need to fix heap (sift up successor) because of this

            //$fnodes->recoverFromCorruption(); // doesn't work for what I want
            // workaround: rebuild $fnodes from scratch
            $fixedHeap = new GraphNodeHeap();
            foreach($fnodes as $e)
                $fixedHeap->insert($e);
            $fnodes = $fixedHeap;
        }
    }
}

class GraphNodeHeap extends SplHeap {
    public function compare($a, $b) {
        if($a->fanIn === $b->fanIn)
            return 0;
        else
            return $a->fanIn < $b->fanIn ? 1 : -1;
    }
}

Complete code also available: https://github.com/md2k7/civicracy/blob/master/civi-php/protected/components/VoteGraph.php#L73

EDIT 2:

$this->putNode(new GraphNode(4));
$this->putNode(new GraphNode(1, 2));
$this->putNode(new GraphNode(3, 2));
$this->putNode(new GraphNode(2, 4));

This means user 1 and user 3 are voting for user 2, and user 2 is voting for user 4, passing on 3 votes (2 received + his/her own). This is called delegative voting: my algorithm is passing on votes "from the bottom" (the leaves) where I already know how much weight (responsibility/representation/as you like it...) each user has.


Solution

  • I was solving very similar problem recently, it seems like SPL doesn't support updates. So

    I had to write my own heap.

    It isn't extra efficient, but it does what I need and it's much faster than sorting an array repeatedly... SPL heap is still much faster though...

    here it is...

    class heap
    {
        public $members=array();
    
        // these two are just for statistics
        private $swaps=0; 
        private $recurs=array('lups'=>0, 'ldowns'=>0);
    
        public function insert($val){
    
            if(is_array($val) && empty($this->members)){ // because heapify is (in theory) more efficient
                foreach($val as $v){
                    $this->members[]=$v;
                }
                $this->heapify();
            }else{
                $emptyPosition=count($this->members);  // count(members) gets index of first empty position, not last key
                $this->members[]=$val; // puts $val in
                $this->ladderup($emptyPosition);
            }
        }
    
        public function heapify(){
        /* in case all the heap is broken, we can always use this to repair it.
         It should be more efficient to fill $members randomly and "repair" it with heapify after,
         than inserting things one by one*/
    
            $start=max(0, floor( (count($this->members)-1)/2)); // find last parent
            for($i=$start;$i>=0;$i--){
                $this->ladderdown($i);
            }
        }
    
        private function ladderdown($index){
        // recursively sifts down $index
            $this->recurs['ldowns']++;
    
            /*
            indexes of children
            they are stored at  parent_position*2 and parent_position*2+1
            but becouse php uses null-based array indexing, we have to modify it a little
            */  
            $iA=$index*2+1; 
            $iB=$index*2+2;
    
            if($iA<count($this->members)){ // check if children exist
                if($iB<count($this->members)){
                    if($this->compare($iA, $iB)>=0) $bigger=$iA; // if both exist, compare them, cause we want to swap with the bigger one ; I'm using ">=" here, that means if they're equal, left child is used
                    else $bigger=$iB;
                }else{
                    $bigger=$iA; // if only one children exists, use it
                }
    
                if($this->compare($bigger, $index)>0){ // not using ">=" here, there's no reason to swap them if they're same
                    $this->swap($bigger, $index);
                    $this->ladderdown($bigger); // continue with $bigger because that's the position, where the bigger member was before we swap()ped it 
                }
            }
        }
    
        private function ladderup($index){
        // sift-up, 
            $this->recurs['lups']++;
    
            $parent=max(0, floor( ($index-1)/2)); // find parent index; this way it actualy swaps one too many times: at the end of sift-up-ing swaps the root with itself
            if($this->compare($index, $parent)>0){
                $this->swap($index, $parent);
                $this->ladderup($parent);
            }
        }
    
        public function root(){
            if(count($this->members)){
                return $this->members[0];
            }
            return false;   
        }
    
        public function extract(){
        // removes and returns root member
            if(!count($this->members)) return false;
    
            $this->swap(0,count($this->members)-1); // swaps root with last member
            $result=array_pop($this->members); // removes last member (now root)
            $this->ladderdown(0); // root is on wrong position, sifts it down
            return $result;
        }
    
        public function update($index, $value){
            if($index<count($this->members)){
                $this->members[$index]=$value;
                $this->ladderup($index);
                $this->ladderdown($index);
            }
        }
    
        public function delete($index){
        // removes index from heap the same way as root is extracted
            $this->swap(count($this->members)-1, $index); // swaps index with last one
            array_pop($this->members);
            $this->ladderup($index);
            $this->ladderdown($index);
        }
    
        private function swap($iA, $iB){
        // swaps two members
            $this->swaps++;
    
            $swap=$this->members[$iA];
            $this->members[$iA]=$this->members[$iB];
            $this->members[$iB]=$swap;
        }
    
        private function compare($iA, $iB){
            $result=$this->members[$iA] - $this->members[$iB];
            return $result;
        }
    
        public function stats($text=""){
         // prints and resets statistics
            echo "STATS: $text... Sift-ups: ".$this->recurs['lups']." Sift-downs: ".$this->recurs['ldowns']." Swaps: ".$this->swaps." <br>";
            $this->recurs=array('lups'=>0, 'ldowns'=>0);
            $this->swaps=0;
        }
    }
    
    //here's how to use it...
    
    $h=new heap;
    
    for($i=0; $i<10000; $i++){
        $h->insert(rand(1,1000));
    }
    $h->stats("after inserting one-by-one");
    
    while($biggest=$h->extract()); // note that $h->extract might return FALSE, but might return zero as well, if there was zero in the heap
    
    $h->stats("after extracting all roots (like in heapsort)");
    
    echo "Now, heap is empty. Let's try whole array at once <br>";
    
    for($i=0; $i<10000; $i++){
        $a[]=rand(1,1000);
    }
    $h->insert($a); // inserting whole array here, so heap will use more efficient heapify()
    $h->stats("after heapify");
    
    echo "let's update two indexes<br>";
    
    $h->update(1234,44444);// sure on top
    $h->stats("after update");
    $h->update(8888,40000);// second place
    $h->stats("after update");
    
    echo "extract biggest three indexes<br>";
    
    echo $h->extract()." - this should be 44444<br>";
    echo $h->extract()." - this should be 40000<br>";
    echo $h->extract()." - this should be biggest number given by rand(1,1000)<br>";
    
    $h->stats("after three extracts");
    
    while($h->extract());
    $h->stats("after extracting the rest");
    

    and result is:

    STATS: after inserting one-by-one... Sift-ups: 22651 Sift-downs: 0 Swaps: 12651
    STATS: after extracting all roots (like in heapsort)... Sift-ups: 0 Sift-downs: 116737 Swaps: 116737
    Now, heap is empty. Let's try whole array at once
    STATS: after heapify... Sift-ups: 0 Sift-downs: 12396 Swaps: 7396
    let's update two indexes
    STATS: after update... Sift-ups: 11 Sift-downs: 1 Swaps: 10
    STATS: after update... Sift-ups: 13 Sift-downs: 1 Swaps: 12
    extract biggest three indexes
    44444 - this should be 44444
    40000 - this should be 40000
    1000 - this should be biggest number given by rand(1,1000)
    STATS: after three extracts... Sift-ups: 0 Sift-downs: 42 Swaps: 42
    STATS: after extracting the rest... Sift-ups: 0 Sift-downs: 116652 Swaps: 116652

    You will have to modify it a bit, but anyway, hope it helps..