Tags: c++, algorithm, time-complexity, quicksort

Quicksort with random data causes an N^2 complexity graph


I am trying to plot the theoretical time complexity together with measured values for quicksort. The plots look right when I input increasing, decreasing, or constant data series (they follow N^2 nicely, as expected for a last-element pivot). The problem arises when I feed in a random input sequence.

For quicksort on random data, the expected time complexity is N log N, but my data points suggest that it is N^2. I have tried different implementations of the partitioning step, and I have verified that the elements do get sorted.

Here is my implementation of quicksort and the partition step:

template<typename BidirectionalIterator, typename Compare>
BidirectionalIterator partition_impl(BidirectionalIterator first,
                                     BidirectionalIterator last, Compare cmp) {
    auto pivot = std::prev(last);   // use the last element as the pivot
    if (first == pivot) return first;
    while (first != last) {
        while (cmp(*first, *pivot)) { ++first; }    // skip elements < pivot
        // scan right-to-left for an element < pivot
        // (note: the >= and > iterator comparisons require random-access iterators)
        do { --last; } while (!cmp(*last, *pivot) && last >= first);
        if (first > last) {         // the iterators have crossed: partition done
            std::iter_swap(first, pivot);
            return first;
        }
        if (cmp(*last, *first)) {   // use cmp rather than operator> so custom comparators work
            std::iter_swap(first, last);
            ++first;
        }
    }
    std::iter_swap(first, pivot);
    return first;
}

template <typename FwdIt, typename Comp = std::less<>>
void quick_sort(FwdIt first, FwdIt last, Comp cmp = Comp{}) {
    if (std::distance(first, last) <= 1) return;

    FwdIt split = partition_impl(first, last, cmp);

    quick_sort(first, split, cmp);           // sort the left partition
    quick_sort(std::next(split), last, cmp); // sort the right partition (pivot is already in place)
}

So my question: is there anything in my implementation that indicates what the problem may be?

This is what my graph looks like: [graph: measured times for random input tracking the N^2 curve rather than N log N]

EDIT: Some additional code:

void measure(std::vector<int>& ds) {
    Timer::start();
    algo_ptr(ds.begin(), ds.end(), std::less<>());  // algo_ptr points at quick_sort above
    Timer::stop();
}

void collectSamples(data_series_t& data_series) {
    for (int k = 0; k < NUM_OF_SAMPLES; ++k) {
        auto ds = std::get<1>(data_series)(x_measurement_point);  // build a fresh input vector
        measure(ds);
        measurements.push_back(Timer::elapsedTime());
    }
}

The algo_ptr points directly at my implementation above. For every data series I collect 10 samples, and on each iteration I generate a new random data series so that I never sort already-sorted elements.


Solution

  • Thanks everyone for trying to help. I really appreciate it. I think I have found the solution.

    The problem was that I had allowed only 50 distinct random values to occur. That is far too few distinct values, since I am measuring far more elements than that, so the input was dominated by duplicate keys.

    This effectively meant the pivot was no longer drawn from enough distinct values: with mostly duplicate keys the partitions become badly unbalanced, which made my quicksort deteriorate into N^2.

    In hindsight this seems like such an obvious thing that I should have realized it, but I am still a beginner with DSA.

    After fixing this and making sure that there are several million distinct possible random values, the graph looks like this:

    [graph: measured times now following the N log N curve]