Search code examples
pythonlistmergesort

python is shuffling my list


I'm trying to implement a merge sort algorithm using lists in Python. My merge code is fine (I'll not put it here right now, but if you ask me, I can happily post it), but when it returns to the recursive level above, the list is shuffled and I don't know why.

To illustrate, this is whats is happening:

array 1: [56]
array 2: [96]
merged array: [56, 96]    # the merge is fine

array 1: [76]
array 2: [73]
merged array: [73, 76]    # here too

Notice that the merge is really fine. But, notice also that the 'array 2' in the next section is shuffled (theoretically is the same list as the [73, 76] above)

array 1: [56, 96]
array 2: [76, 73]         # this is broken
merged array: [56, 76, 73, 96]

It occurs randomly through the recursive execution. Notice here that it occurs to the array 1, not array 2, showing off that the error isn't in my code:

array 1: [96]
array 2: [34]
merged array: [34, 96]

array 1: [19]
array 2: [34] 
merged array: [19, 34]

array 1: [96, 34]          # this is broken too
array 2: [19, 34]
merged array: [19, 34, 96, 34]

This way, I cannot ever order a list using this merge sort. If python's list is an ordered sequence, can anybody know why this is happening?

As asked, here's the code:

def merge_sort(array):
    if len(array) > 1:
        array1, array2 = partition(array)
        merge_sort(array1)
        merge_sort(array2)
        array = merge(array1, array2)
    return array

def partition(array):
    n = len(array) / 2
    array1 = array[0:n]
    array2 = array[n::]

    return array1, array2   

def merge(a, b):
    array = []
    while len(a) > 0 or len(b) > 0:
        if len(a) > 0 and len(b) > 0:
            if a[0] <= b[0]:
                array.append(a.pop(0))
            else:
                array.append(b.pop(0))
        elif len(a) > 0:
            array.append(a.pop(0))
        elif len(b) > 0:
            array.append(b.pop(0))
    return array

Solution

  • The merge sort is a divide-et-impera algorithm: you break the problem in different subproblems (the divide part) and then combine the subsolutions to generate the problem solution (the impera part).

    In the merge sort, the divide part is to order two sublists generated from your list. The impera part takes these two ordered sublists and merge them in an unique ordered list. When you invoke:

    merge([56, 96], [76, 73])
    

    This cannot work, because the merge function is your impera part, which expects the sublists to be ordered. Your code is not working because you are not keeping the ordered sublists, but you are throwing them away, since you are not sorting the array in-place. Instead of:

    merge_sort(array1)
    

    You should have:

    array1 = merge_sort(array1)
    

    Otherwise when you call:

    merge(array1, array2)
    

    The sublists are not ordered.