Search code examples
pythonalgorithmperformancebubble-sort

Is this most efficient to bubble sort a list in python?


I'm trying to see if this is the most efficient way to sort a bubble list in python or if there are better ways some people tell me to use two loops, what are the benefits of doing like that vs the below

def sort_bubble(blist):
    n = 0
    while n < len(blist) - 1:
        if blist[n] > blist[n + 1]:
            n1 = blist[n]
            n2 = blist[n + 1]
            blist[n] = n2
            blist[n + 1] = n1
            n = 0
        else:
            n = n + 1
    print blist

Solution

  • Your algorithm is technically a bubble sort in that it does exactly the swaps that it should. However, it's a very inefficient bubble sort, in that it does a lot more compares than are necessary.

    How can you know that? It's pretty easy to instrument your code to count the number of compares and swaps. And meanwhile, Wikipedia gives implementations of a simple bubble sort, and one with the skip-sorted-tail optimization, in a pseudocode language that's pretty easy to port to Python and similarly instrument. I'll show the code at the bottom.

    For a perfect bubble sort, given a random list of length 100, you should expect a bit under 10000 compares (100 * 100), and a bit under 2500 swaps. And the Wikipedia implementation does exactly that. The "skip-sorted-tail" version should have just over half as many compares, and it does.

    Yours, however, has 10x as many compares as it should. The reason your code is inefficient is that it starts over at the beginning over and over, instead of starting where it swapped whenever possible. This causes an extra factor of O(sqrt(N)).

    Meanwhile, almost any sort algorithm is better than bubble sort for almost any input, so even an efficient bubble sort is not an efficient sort.


    I've made one minor change to your code: replacing the four-line swap with a more idiomatic single-line swap. Otherwise, nothing is changed but adding the cmpcount and swapcount variables, and returning the result instead of printing it.

    def bogo_bubble(blist):
        cmpcount, swapcount = 0, 0
        n = 0
        while n < len(blist) - 1:
            cmpcount += 1
            if blist[n] > blist[n + 1]:
                swapcount += 1
                blist[n], blist[n+1] = blist[n+1], blist[n]
                n = 0
            else:
                n = n + 1
        return blist, cmpcount, swapcount
    

    This is the Psuedocode implementation from Wikipedia, translated to Python. I had to replace the repeat… unit with a while True… if not …: break, but everything else is trivial.

    def wp1_bubble(blist):
        cmpcount, swapcount = 0, 0
        while True:
            swapped = False
            for i in range(1, len(blist)):
                cmpcount += 1
                if blist[i-1] > blist[i]:
                    swapcount += 1
                    blist[i-1], blist[i] = blist[i], blist[i-1]
                    swapped = True
            if not swapped:
                break
        return blist, cmpcount, swapcount
    

    This is the Optimizing bubble sort, which does the simple version of the skip-sorted-tail optimization, but not the more elaborate version (which comes right after it).

    def wp2_bubble(blist):
        cmpcount, swapcount = 0, 0
        n = len(blist)
        while True:
            swapped = False
            for i in range(1, n):
                cmpcount += 1
                if blist[i-1] > blist[i]:
                    swapcount += 1
                    blist[i-1], blist[i] = blist[i], blist[i-1]
                    swapped = True
            n -= 1
            if not swapped:
                break
        return blist, cmpcount, swapcount
    
    import random
    alist = [random.randrange(100) for _ in range(100)]
    bb, cb, sb = bogo_bubble(alist[:])
    b1, c1, s1 = wp1_bubble(alist[:])
    b2, c2, s2 = wp2_bubble(alist[:])
    assert bb == b1 == b2
    print('bogo_bubble: {} cmp, {} swap'.format(cb, sb))
    print('wp1_bubble : {} cmp, {} swap'.format(c1, s1))
    print('wp2_bubble : {} cmp, {} swap'.format(c2, s2))
    

    Typical output:

    bogo_bubble: 100619 cmp, 2250 swap
    wp1_bubble : 8811 cmp, 2250 swap
    wp2_bubble : 4895 cmp, 2250 swap