Tags: php, arrays, multidimensional-array, query-performance

array_merge VS direct array injection performance


Is there any difference in performance, or in any other respect, between these?

$a = ['a' => 1, 'b' => 2, 'c' => 3];
$b = ['d' => 4, 'e' => 5, 'f' => 6];
$c = array_merge($a, $b);

VS

$a = [];
$a['a'] = 1;
$a['b'] = 2;
$a['c'] = 3;

$b = [];
$b['d'] = 4;
$b['e'] = 5;
$b['f'] = 6;
$c = array_merge($a, $b);

VS

$a = [];
$a = ['a' => 1, 'b' => 2, 'c' => 3];
$a['d'] = 4;
$a['e'] = 5;
$a['f'] = 6;

Solution

  • First, micro-optimizations like this are usually pointless unless you have a huge number of requests to process or a very large data set. That being said...

    Options 1 and 2 should give roughly the same performance. However, the first will be slightly faster because $a and $b are each built in one step from an array literal, so there is no need to expand them dynamically one element at a time, which IS required in the second example.

    However, both of the first two examples use array_merge(), which adds the overhead of a function call and still has to check whether each key already exists. Note that array_merge() overwrites the element associated with a string key when that key already exists. This is not done for elements with numeric keys, however. In that case there is no such check and nothing is overwritten; the elements are simply appended to the end of the result array and renumbered. Here is the explanation from the PHP documentation:

    If the input arrays have the same string keys, then the later value for that key will overwrite the previous one. If, however, the arrays contain numeric keys, the later value will not overwrite the original value, but will be appended. Values in the input arrays with numeric keys will be renumbered with incrementing keys starting from zero in the result array. https://www.php.net/manual/en/function.array-merge.php
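
    The behaviour described in that quote can be seen in a short sketch. The arrays and variable names below are made up for illustration and are not part of the benchmark that follows:

    ```php
    <?php

    // String keys: the later value replaces the earlier one.
    $first  = ['a' => 1, 'b' => 2];
    $second = ['b' => 20, 'c' => 30];
    $mergedStrings = array_merge($first, $second);
    // $mergedStrings is ['a' => 1, 'b' => 20, 'c' => 30]

    // Integer keys: nothing is overwritten; values are appended
    // and renumbered starting from zero.
    $left  = [5 => 'x', 6 => 'y'];
    $right = [5 => 'z'];
    $mergedInts = array_merge($left, $right);
    // $mergedInts is [0 => 'x', 1 => 'y', 2 => 'z']

    // PHP casts a purely numeric string key such as '5' to the integer 5,
    // so it is treated as a numeric key. A key like '5 ' (with a trailing
    // space) stays a string.
    $castKey   = array_keys(['5' => 'x']);   // [5]
    $stringKey = array_keys(['5 ' => 'x']);  // ['5 ']

    print_r($mergedStrings);
    print_r($mergedInts);
    var_dump($castKey, $stringKey);
    ```

    This is why the key type matters when comparing array_merge() with plain assignment: with integer keys the two approaches do not even produce the same result.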

    Of course, the advantage of using array_merge() is that you can tell at a glance what it does.

    Here is my benchmark, which compares inserting the items of $b directly into a copy of $a against merging $a and $b into a new array with array_merge(). Each of the two initial arrays holds 1 million items.

    <?php
    
    
    $a = [];
    $b = [];
    
    /* Insert 1000000 elements into $a with string keys '0 ' through '999999 '. The trailing space keeps each key a string; PHP would cast a plain numeric string such as '0' to an integer key. */
    for ($i = 0; $i < 1000000; $i++)
    {
      $a["{$i} "] = $i;
    }
    
    /* Insert 1000000 elements into $b with string keys '1000000 ' through '1999999 ' (trailing space included, so they stay string keys) */
    for ($j = 1000000; $j < 2000000; $j++)
    {
      $b["{$j} "] = $j;
    }
    
    
    $temp = $a;
    
    /*Inserting the values of $b into $temp in a loop*/
    $start = microtime(true);
    foreach($b as $key => $current)
    {
       $temp[$key] = $current;  
    }   
    
    $end = microtime(true);
    $runtime = $end - $start;
    $output =  "<p>Inserted elements of array a and b with assignment in %.10f ({$runtime}) seconds</p>";
    
    echo sprintf($output, $runtime);
    
    
    
    /*Using array_merge to merge $a and $b */   
    $start = microtime(true);
    
    $c = array_merge($a, $b);
    
    $end = microtime(true);
    
    
    $runtime = $end - $start;
    $output =  "<p>Merged array a and b with array_merge() in %.10f  ({$runtime}) seconds </p>";
    
    echo sprintf($output, $runtime);
    

    Output:

    Inserted elements of array a and b with assignment in 0.1125514507 (0.11255145072937) seconds

    Merged array a and b with array_merge() in 0.0289690495 (0.028969049453735) seconds.

    I modified the benchmark so it uses a temporary array in the assignment test, so that $a and $b are never modified. The difference in running times was still there, however. To get a good average, I ran it 1000 times and averaged both running times. The final result was not much different from the initial run. The average time for the first approach was about 0.1012 seconds, while the array_merge() approach took 0.0574 seconds. That's a difference of around 0.0438 seconds or 43.8 ms.
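
    For anyone who wants to repeat the averaged measurement, here is a minimal sketch of how it could be set up. It is not the exact script used for the numbers above: averageRuntime() is a hypothetical helper, and the array size and run count are assumed values chosen to keep the sketch quick.

    ```php
    <?php

    // Hypothetical helper: runs $fn $runs times and returns the
    // mean wall-clock time per run, in seconds.
    function averageRuntime(callable $fn, int $runs): float
    {
        $total = 0.0;
        for ($i = 0; $i < $runs; $i++) {
            $start = microtime(true);
            $fn();
            $total += microtime(true) - $start;
        }
        return $total / $runs;
    }

    // Assumed values, much smaller than the 1,000,000 items and 1000 runs above.
    $size = 10000;
    $runs = 20;

    $a = [];
    $b = [];
    for ($i = 0; $i < $size; $i++) {
        $j = $i + $size;
        $a["{$i} "] = $i;   // trailing space keeps the key a string
        $b["{$j} "] = $j;
    }

    $loopResult  = [];
    $mergeResult = [];

    // Approach 1: assign the elements of $b into a copy of $a.
    $loopAvg = averageRuntime(function () use ($a, $b, &$loopResult) {
        $temp = $a;         // work on a copy so $a and $b are never modified
        foreach ($b as $key => $value) {
            $temp[$key] = $value;
        }
        $loopResult = $temp;
    }, $runs);

    // Approach 2: let array_merge() build the combined array.
    $mergeAvg = averageRuntime(function () use ($a, $b, &$mergeResult) {
        $mergeResult = array_merge($a, $b);
    }, $runs);

    printf("Loop assignment: %.10f seconds on average\n", $loopAvg);
    printf("array_merge():   %.10f seconds on average\n", $mergeAvg);
    ```

    The absolute numbers will vary with the PHP version, the machine and the array size, so treat them as relative indicators only.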

    So, on average, array_merge() took about 57% of the time the loop needed, which is roughly 43% less time (about 1.76 times as fast). This is interesting, given that I found array_merge() to be slow in older versions of PHP. However, as of PHP 7.3, it looks like array_merge() should be chosen over manually merging arrays with string keys.
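
    The two averages can be related in more than one way, so it may help to spell out the arithmetic. This small sketch uses only the two averages reported above; the variable names are illustrative:

    ```php
    <?php

    // Average runtimes reported above, in seconds.
    $loopAvg  = 0.1012;
    $mergeAvg = 0.0574;

    $difference = $loopAvg - $mergeAvg;          // absolute saving in seconds
    $share      = $mergeAvg / $loopAvg * 100;    // array_merge() time as a % of the loop time
    $reduction  = $difference / $loopAvg * 100;  // % less time than the loop
    $speedup    = $loopAvg / $mergeAvg;          // how many times as fast

    printf("Difference: %.1f ms\n", $difference * 1000);
    printf("array_merge() time relative to the loop: %.1f%%\n", $share);
    printf("Time saved relative to the loop: %.1f%%\n", $reduction);
    printf("Speed-up factor: %.2f\n", $speedup);
    ```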