Tags: c++, algorithm, sorting, optimization, bubble-sort

Optimizing bubble sort - What am I missing?


I'm trying to understand possible optimization methods for the bubble sort algorithm. I know there are better sorting methods, but I'm just curious.

To test the efficiency I'm using std::chrono. The program sorts an int array of 10000 numbers 30 times and prints the average sorting time. The numbers are picked randomly with rand() in every iteration. Here is the code, with no optimization:

#include <iostream>
#include <cstdlib>   // rand, srand
#include <ctime>     // time
#include <chrono>
#include <utility>   // std::swap
using namespace std;



int main() {


    //bubble sort
    srand(time(NULL));

    chrono::time_point<chrono::steady_clock> start, end;

    const int n = 10000;
    int i, j, last, tests = 30, arr[n];
    long long total = 0;
    bool out;

    while (tests-->0) {


        for (i = 0; i < n; i++) {
            arr[i] = rand() % 1000;
        }


        j = n;

        start = chrono::steady_clock::now();   // matches the steady_clock time_point declared above
        while(1){

            out = 0;
            for (i = 0; i < j - 1; i++) {

                if (arr[i + 1] < arr[i]) {
                    swap(arr[i + 1], arr[i]);
                    out = 1;
                }
            }

            if (!out) {
                    break;
            }

            //j--;

        }


        end = chrono::steady_clock::now();

        total += chrono::duration_cast<chrono::nanoseconds>(end - start).count();
        cout << "Remaining :"<<tests << endl;

    }

    cout << "Average  :" << total / static_cast<double>(30)/1000000000<<" seconds"; // tests(30)  + nanosec -> sec


    cin.sync();
    cin.ignore();
    return 0;
}

I get 0.17 seconds average sorting time.

If I uncomment the j--; line at the end of the outer loop, so that numbers that are already in their final place are not compared again, I get a sorting time of 0.12 seconds, which is understandable.
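For reference, here is a minimal sketch of bubble sort with only that first optimization applied, pulled out into its own function (the function name is just for illustration; it assumes the same includes and using namespace std as the code above):

// Bubble sort where the compared range shrinks by one after every pass,
// because each pass moves the largest remaining element to the end.
void bubbleSortShrinking(int arr[], int n) {
    int j = n;
    bool out = true;
    while (out) {
        out = false;
        for (int i = 0; i < j - 1; i++) {
            if (arr[i + 1] < arr[i]) {
                swap(arr[i + 1], arr[i]);
                out = true;
            }
        }
        j--;   // the element at position j - 1 is now in its final place
    }
}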

If I remember the last position where a swap took place, I know that the elements after that index are already sorted, and can therefore limit further passes to that position. It's explained in more detail in the second part of this answer: https://stackoverflow.com/a/16196115/1967496. Here is the code that implements this additional optimization:

#include <iostream>
#include <cstdlib>   // rand, srand
#include <ctime>     // time
#include <chrono>
#include <utility>   // std::swap
using namespace std;



int main() {


    //bubble sort
    srand(time(NULL));

    chrono::time_point<chrono::steady_clock> start, end;

    const int n = 10000;
    int i, j, last, tests = 30, arr[n];
    long long total = 0;
    bool out;

    while (tests-->0) {


        for (i = 0; i < n; i++) {
            arr[i] = rand() % 1000;
        }


        j = n;

        start = chrono::steady_clock::now();   // matches the steady_clock time_point declared above
        while(1){

            out = 0;
            for (i = 0; i < j - 1; i++) {

                if (arr[i + 1] < arr[i]) {
                    swap(arr[i + 1], arr[i]);
                    out = 1;
                    last = i;   // remember where the last swap happened
                }
            }

            if (!out) {
                    break;
            }

            j = last + 1;   // elements from index last + 1 onward are already in place

        }


        end = chrono::steady_clock::now();

        total += chrono::duration_cast<chrono::nanoseconds>(end - start).count();
        cout << "Remaining :"<<tests << endl;

    }

    cout << "Average  :" << total / static_cast<double>(30)/1000000000<<" seconds"; // tests(30)  + nanosec -> sec


    cin.sync();
    cin.ignore();
    return 0;
}

Note the last = i; and j = last + 1; lines. And here comes the problem: the average time is now again around 0.17 seconds. Is there a problem in my code, or am I missing something?

Update:

I ran the sort with 10 times more numbers and now get the following results:

  • No optimization: 19.3 seconds
  • First optimization (j--): 14.5 seconds
  • Second (supposed) optimization (j = last + 1): 17.4 seconds

From my understanding, the second method should never be worse than the first, but the numbers say otherwise.


Solution

  • Well... The problem is that there might not be a right or wrong answer to this question.

    First of all, when you're comparing only 10000 elements, you cannot really call it an efficiency test. Try a much higher number of elements - maybe 500000 (although you will probably need to allocate the array dynamically for that); a sketch of such a setup follows below.

    Second of all, it might be the compiler. Compilers often optimize the generated code so that the program runs smoother and faster.
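As an illustration of the first point, here is a minimal sketch of such a test with a dynamically allocated array, using std::vector (the 500000 is just the number suggested above, and a single bubble sort of that size will take quite a while). Remember to compile with optimizations enabled (e.g. g++ -O2) so the timings are comparable:

#include <iostream>
#include <cstdlib>
#include <ctime>
#include <chrono>
#include <utility>
#include <vector>
using namespace std;

int main() {
    srand(time(NULL));

    const int n = 500000;      // much larger input, allocated on the heap
    vector<int> arr(n);

    for (int i = 0; i < n; i++) {
        arr[i] = rand() % 1000;
    }

    auto start = chrono::steady_clock::now();

    // bubble sort with the "last swap position" optimization
    int j = n;
    bool swapped = true;
    while (swapped) {
        swapped = false;
        int last = 0;
        for (int i = 0; i < j - 1; i++) {
            if (arr[i + 1] < arr[i]) {
                swap(arr[i + 1], arr[i]);
                swapped = true;
                last = i;
            }
        }
        j = last + 1;          // everything past the last swap is already in place
    }

    auto end = chrono::steady_clock::now();
    cout << chrono::duration_cast<chrono::milliseconds>(end - start).count()
         << " ms" << endl;
    return 0;
}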