This is my code for bucket sort in Python.
from random import randrange
def insertion_sort(aList):
for i in range(1, len(aList)):
for j in range(i, 0, -1):
if aList[j] < aList[j-1]:
aList[j], aList[j-1] = aList[j-1], aList[j]
return aList
def bucket_sort(aList):
buckets = [[]] * len(aList)
for index, value in enumerate(aList):
buckets_index = value * len(aList) // (max(aList) + 1)
buckets[buckets_index].append(value)
answer = []
for bucket in buckets:
answer.extend(insertion_sort(bucket))
# answer += insertion_sort(bucket)
print(buckets[0])
print("\n")
# return answer
aList = [randrange(10) for _ in range(100)]
print(aList)
print("\n")
answer = bucket_sort(aList)
#print(answer)
What is happening? When I run the code, I always find that the first list in buckets is already sorted and the other lists in buckets are all copies of it. Do I need the insertion sort for each list? What would I use the "answer" variable for?!
I'm mainly relying on this visualization.
One thing that i notice right off the bat is that you initialize your variable buckets as buckets = [[]] * len(aList)
. This makes a list of identical copies of the empty list. As such, any modification of this list is replicated in every element of buckets
. Change this line to:
buckets = [[] for _ in xrange(len(aList))]
To check if the lists inside the list are separate object, you could check their id's:
print [id(x) for x in buckets]
This should print a list of unique numbers.