php, optimization, micro-optimization

How to properly increment some array key, even if key needs to be created?


Suppose you need to create a 'top' of some sort and have code like this:

$matches=array();
foreach ($array as $v){
   $matches[processing($v)]++;  
}

This raises a Notice: Undefined index (on PHP 8 and later, a Warning: Undefined array key) whenever the index has to be created.

What would be the best way to tackle these cases since you KNOW you'll have to create indexes?

I have used these solutions, depending on the case:

  1. Suppressing the error: @$matches[$v]++;
    Pro: VERY easy to type
    Con: slow
  2. Checking if it's set: $matches[$v]=isset($matches[$v])?$matches[$v]+1:1;
    Pro: faster
    Con: takes longer to write, even in the shorthand form, and $matches[$v] has to be repeated 2 more times

Are there any other ways?
I'm looking for the fastest execution time, as I'm calling this thousands of times, or for a lazier way to type it that is still faster than @.

EDIT:

In a simple case where you have $matches[$v]++; you could also use array_count_values() (as Yoshi suggested)
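
For that simple case, a minimal sketch of the array_count_values() route could look like the following. The sample data is made up, and strtoupper only stands in for the question's processing() placeholder; array_count_values() counts string and integer values only.

```php
<?php
// Hypothetical sample data; in the question this would be $array.
$array = ['apple', 'pear', 'apple', 'fig', 'apple', 'pear'];

// strtoupper stands in for the question's processing() placeholder.
$keys = array_map('strtoupper', $array);

// array_count_values() builds all counters in one call, with no manual increment.
$matches = array_count_values($keys);

// Sort by count, highest first, keeping the keys, to get the 'top'.
arsort($matches);

print_r($matches);
```

This only helps when every item contributes exactly one increment; anything more involved still needs one of the per-key approaches.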


Solution

  • After some reading, writing and testing I got something:

    function inc(&$var){
        if (isset($var)) $var++;else $var=1;
    }
    

    and thought I struck gold, but let's see the tests first...
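
A short usage sketch of inc(), assuming a fresh array and made-up keys: passing a missing array element by reference creates it as null without raising a notice, so isset() is false on the first call and the counter starts at 1.

```php
<?php
// inc() from the answer, with braces added for readability.
function inc(&$var)
{
    if (isset($var)) {
        $var++;
    } else {
        $var = 1;
    }
}

$a = [];
inc($a['x']);   // 'x' does not exist yet: the reference creates it, inc() sets it to 1
inc($a['x']);   // 'x' exists now, so it is incremented
inc($a['y']);

var_dump($a);   // 'x' => 2, 'y' => 1
```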

    Test code:

    $a=array();
    
    // Pre-Fill array code goes here
    for($i=1;$i<100000;$i++) {
        $r=rand(1,30000);
        //increment code goes here
    }
    
    // Remove extra keys from array with:
    //foreach ($a as $k=>$v) if ($v==0) unset($a[$k]);
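
The skeleton above can be turned into a small runnable harness. This is only a sketch under assumptions: bench() and the $strategies array are hypothetical names, mt_rand() with a fixed seed replaces rand() so every strategy sees the same keys, and wrapping each strategy in a closure adds the same call overhead to all of them, so only the relative order of the timings is meaningful.

```php
<?php
// Hypothetical helper: runs one increment strategy over the same loop as the test code.
function bench(callable $increment, int $iterations = 100000): array
{
    $a = [];
    mt_srand(42);                          // fixed seed: identical key sequence per strategy
    $start = microtime(true);
    for ($i = 1; $i < $iterations; $i++) {
        $r = mt_rand(1, 30000);
        $increment($a, $r);                // $a is passed by reference (see closure signatures)
    }
    return [microtime(true) - $start, $a];
}

$strategies = [
    'isset/if' => function (array &$a, int $r): void {
        if (isset($a[$r])) { $a[$r]++; } else { $a[$r] = 1; }
    },
    '@' => function (array &$a, int $r): void {
        @$a[$r]++;
    },
];

foreach ($strategies as $name => $fn) {
    [$seconds, $counts] = bench($fn);
    printf("%-10s %.4f s, %d keys, %d increments\n", $name, $seconds, count($counts), array_sum($counts));
}
```

Printing the number of keys and the sum of all counters alongside the time makes it easy to confirm that each strategy actually counted the same thing.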
    

    Execution times: (for informative purposes only)

    inc($a[$r])                                   1.15-1.24
    @$a[$r]++                                     1.03-1.09
    $a[$r]=array_key_exists($r,$a)?$a[$r]++:1;    0.99-1.04

    $a[$r]=!empty($a[$r])?$a[$r]++:1;             0.61-0.74
    if (!empty($a[$r])) $a[$r]++;else $a[$r]=1;   0.59-0.67
    $a[$r]=isset($a[$r])?$a[$r]++:1;              0.57-0.65
    if (isset($a[$r])) $a[$r]++;else $a[$r]=1;    0.56-0.64


    //with pre-fill
    $a=array_fill(0,30000,0);                     +0.07(avg)
    for($i=1;$i<=30000;$a[$i++]=0);               -0.04(avg)

    //with pre-fill and unset
    $a=array_fill(0,45000,0);                     +0.16(avg)
    for($i=1;$i<=45000;$a[$i++]=0);               +0.02(avg)
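
One caveat before copying the ternary lines from the timings: $a[$r]++ evaluates to the value before the increment, and the assignment then stores that old value back, so as written those lines leave every counter at 1. A minimal sketch to check this, with a +1 variant for comparison (the key 'k' is made up, and the +1 form was not part of the measurements):

```php
<?php
$a = [];
$b = [];
for ($i = 0; $i < 3; $i++) {
    // As written in the timings: the post-increment yields the OLD value,
    // and that old value is what gets assigned back.
    $a['k'] = isset($a['k']) ? $a['k']++ : 1;

    // Variant that adds explicitly instead of post-incrementing.
    $b['k'] = isset($b['k']) ? $b['k'] + 1 : 1;
}

var_dump($a['k']);   // int(1)
var_dump($b['k']);   // int(3)
```

The if/else forms in the timings are not affected, since they increment in place without assigning the result back.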
    

    Conclusions:

    • @ is of course the fastest to type, and I don't see any problem with using it in this case, but feel free to check this question as well: Suppress error with @ operator in PHP
    • completely suppressing errors via ini_set() (disabling them before the loop and re-enabling them after it) performs worse than every other option
    • inc() looks nice and clean, is easy to type and checks instead of suppressing, but calling it appears to be even slower than @
    • isset() is slightly faster than empty(), but the two perform about the same
    • interestingly, the shorthand (ternary) form is slightly slower than the equivalent if/else statement!
    • the best results come from pre-filling the array. Even if the length is unknown, a good estimate would still be slightly faster on a huge dataset
    • strangely, array_fill() takes slightly longer than a for loop?!
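
For reference, suppressing errors around the whole loop could be sketched as below. The answer does not show the code it measured, so this is an assumption: it uses error_reporting(), which changes the same setting as ini_set('error_reporting', ...), and the sample keys are made up.

```php
<?php
$a = [];
$keys = [3, 7, 3, 3, 7, 1];   // hypothetical sample keys

// Turn off notices and warnings for the loop only, remembering the previous level.
$previous = error_reporting(error_reporting() & ~(E_NOTICE | E_WARNING));

foreach ($keys as $r) {
    $a[$r]++;                 // a missing key starts from null, so the first ++ gives 1
}

// Restore the previous reporting level.
error_reporting($previous);

print_r($a);
```

Unlike @, this silences every notice and warning raised inside the loop, not only the missing-key one, which is worth weighing against the typing saved.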

    RFC

    I don't consider this answer 100% complete, but for now it looks like isset() is the fastest and @ the laziest.
    Any comments and ideas are appreciated!