I have already made code for insertion sort and merge sort. Now I want to implement my insertion sort and merge sort into a Tim sort.
I can see that the example of Tim sort uses start, mid and end inputs but it should be possible to do this without no?
I would like to keep my merge and insertion sorts as they are if possible because of input outputs fits well with the rest of my code.
from random import randint
minrun = 32
def insertion_sort(in_data):
s_data = list(in_data)
for i in range(1, len(s_data)):
key = s_data[i]
j = i - 1
while j >= 0 and key < s_data[j]:
s_data[j + 1] = s_data[j]
j -= 1
s_data[j + 1] = key
return s_data
def merge(a, b):
c = []
a_idx, b_idx = 0, 0
while a_idx < len(a) and b_idx < len(b):
if a[a_idx] < b[b_idx]:
c.append(a[a_idx])
a_idx += 1
else:
c.append(b[b_idx])
b_idx += 1
if a_idx == len(a):
c.extend(b[b_idx:])
else:
c.extend(a[a_idx:])
return c
def merge_sort(a):
if len(a) <= 1:
return a
left, right = merge_sort(a[:len(a) // 2]), merge_sort(a[len(a) // 2:])
return merge(left, right)
def tim_sort(in_data):
n = len(in_data)
for start in range(0, n, minrun):
end = min(start + minrun - 1, n - 1)
in_data = insertion_sort(in_data, start, end)
curr_size = minrun
while curr_size < n:
for start in range(0, n, curr_size * 2):
mid = min(n - 1, start + curr_size - 1)
end = min(n - 1, mid + curr_size)
in_data = merge_sort(in_data, start, mid, end)
curr_size *= 2
return in_data
def create_array(size=10, max=50):
from random import randint
return [randint(0, max) for _ in range(size)]
I found this example of Tim sort but I struggle with how to make it work within my code.
27 November 2020
It is not clear where you obtained this example, but it certainly is not timsort. If you want to implement timsort, the first thing to do is read and understand Tim Peters description of the algorithm:
https://svn.python.org/projects/python/trunk/Objects/listsort.txt
This is the definitive document regarding timsort. You can find all kinds of rubbish with google. The only reference I have ever found that might be worth reading to get your feet wet is:
https://www.infopulse.com/blog/timsort-sorting-algorithm
which is lightweight, but fairly complete and not seriously incorrect in any way. It does, however, omit any consideration of galloping, which is the trickiest part of the algorithm.
It is important to realize python
is a dynamic language so dumbly implementing timsort will produce something that uses large excesses of memory due to internal object allocation. Timsort requires:
Sorting in-place in python
implies indexing the data list manually. If you use slices, you allocate and dispose of memory each time.
There are three python
implementations on the web that are worth looking at for guidance that I am aware of:
1 https://github.com/reingart/pypy/blob/master/rpython/rlib/listsort.py
2 https://gist.github.com/ruminations/89a045dc0ef7edfb92304a0de0752ee0
3 https://github.com/hu-ng/timsort
The first is part of the pypy
trunk and is implemented in rpython
. It appears to be an adaptation of the cpython
implementation. rpython
is a restricted subset of python intended for static compilation.
The second is a well tested implementation that is well documented and fairly readable. The last is apparently a university exercise, and appears to be complete and correct, but not well tested.
You can find dozens of other attempts at python implementation of timsort but all I have seen either fail to fulfill basic requirements or are clearly incorrect.
Finally, if you expect someone to help you adapt your code, you should at least link to it, but better to provide it directly, for neither mergesort nor insort are difficult to code.