Search code examples
algorithmsortingarray-algorithms

How does Sort algorithm performance vary with repeated elements in data


Does the performance of a sorting algorithms(Merge,Quick) reduce with repeated elements for the same sized array.

can I say

var arr = [6,4,7,1,8,3,9,2,10,5]; --> Sorts faster because of unique elements
var arr = [1,1,3,3,3,2,4,4,5,10]; --> Slower because repeated elements 

I tried to write a simple quicksort implementation and it gives me the above mentioned behavior . Am I doing something wrong

var Quicksort = function(arr, si, ei) {

    if (si >= ei) return;

    var pivot = arr[ei];
    var pi = si;

    for (var i = si; i < ei; i++) {

        if (arr[i] < pivot) {
            Swap(arr, pi, i);
            pi++;
        }
    }


    Swap(arr, pi, ei);

    Quicksort(arr, 0, pi - 1);
    Quicksort(arr, pi + 1, ei);

    return arr;

}

function Swap(arr, i, j) {
    var temp = arr[i];
    arr[i] = arr[j];
    arr[j] = temp;
}

// Execute 
var input = [];
for (var i = 0; i < 1000; i++) {
    input.push(Math.floor(Math.random() * 100000));
}
console.log(input);

var result = Quicksort(input, 0, input.length - 1);

console.log(result);

Thanks


Solution

  • The number of equal or different values does not affect the efficiency of either merge or quicksort.

    what affects the efficiency of quicksort is how you choose the pivot value and if the array is already sorted.

    read: http://www.geeksforgeeks.org/when-does-the-worst-case-of-quicksort-occur/

    when it comes to mergesort, what affects its efficiency is how many comparisons your end up doing during the merge phase.

    read: When will the worst case of Merge Sort occur?

    Therefore, the frequency of numbers does not affect the sorting algorithms.

    of course, assuming you have an array of size n, where all values are equal to x, quicksort will perform well, since the array will already be sorted.

    like: [6,6,6,6,6], quicksort would performe in n².

    I believe that there are other sorting algorithms that would do a better job for arrays where values repeat a lot, I would say counting sort (http://www.geeksforgeeks.org/counting-sort/) would be a good approach.