Search code examples
pythonsortinglinked-listtuplesmergesort

Sort Tuples in a LinkedList formed using class constructor without using built-in functions - Python


I'm looking for help getting started with an assigned LinkedList sorting task. I have been given the starter code below and asked to come up with a suitable sorting method - I'm thinking MergeSort might work best? - but most examples I see online add to the LinkedList using an addNode function alongside a separate Node class so I'm unsure how to proceed with the code below as it specifies I should use data[0][0] for self.name and data[0][1] for self.quantity which is where I am struggling to extract the values for.

class LinkedList:
    def __init__(self, data):
        self.name = data[0][0]
        self.quantity = data[0][1]
        self.tail = None if (len(data) == 1) else LinkedList(data[1:])

# Shortened list to improve readability
fruits = LinkedList([("Apples",7),("Bananas",2),("Dragon Fruit",1),("Pomelo",14),("Grapes",65),("Cherries",43),("Pears",6),("Mangoes",31)])

I tried to iterate through the LinkedList but the solutions I tried involved converting to a list or use built in Python sorting functions (both of which I am not allowed to use for the task).

Additionally, I tried using a 'for in' loop as shown below, creating a self.result array and then appending resultName and resultQuantity as a tuple which I think is a partial solution but I haven't been able to find a way to solve this using the provided self.name (data[0][0]) and self.quantity (data[0][1]) attributes.

# Added to LinkedList class __init__
self.data = data
self.result = []
        
        for each_tuple in self.data:
            resultName = each_tuple[0]
            resultQuantity = each_tuple[1]
            self.result.append((resultName, resultQuantity))

print(fruits.result)

I assume there is probably an easy fix for this but I'm unsure how to proceed without changing the starting code.

Thanks for any help!


Solution

  • I'm thinking MergeSort might work best?

    Yes, that is a good candidate.

    most examples I see online add to the LinkedList using an addNode function alongside a separate Node class

    You don't need an addNode function, since the __init__ function already deals with adding nodes. Your work will be to rewire the existing nodes by altering their tail references.

    The exercise only gives you one class, while other implementations use two classes: one for node instances, and one for the linked list instance having a reference to the head node of the linked list. Here however, the single class corresponds to what other implementations call the Node class -- despite the name LinkedList. There is no container class here that holds the head node reference. This you have to manage "yourself". But this is not a major problem.

    the solutions I tried involved converting to a list or use built in Python sorting functions

    You shouldn't have to add code to the __init__ method. The idea is that you create another method that will do the sorting.

    This method should not create a list like self.result = [], as then you just move the problem of sorting to the standard list, and you are not allowed to use one anyway let be you could call sort or sorted on it. Moreover, the end result needs to be a sorted linked list, so you'd still have to copy the sorted list data back into the linked list. The idea of this exercise is that you sort the linked list without this extra data structure.

    there is probably an easy fix for this

    No, there isn't, as you should not use any of that code that creates and iterates a standard list. So that code needs to be discarded.

    You can ease your work by first creating an __iter__ method on your class. This is not required, but it makes working on this exercise a bit easier, as this can be used to print the list or convert it to a standard list for debugging purposes. Related to that, we have a node content here that has two components: name and quantity. As we want to sort by quantity, it seems a good idea to have a method that returns this data as a tuple: tuples can be compared while sorting. This also gives a way to solve "ties" -- when two nodes have the same quantity we maybe want to use the name to define which one comes first.

    Here are two methods I would add to the class:

        def tuple(self):  # To get the content of a node as a tuple, easing comparisons
            return self.quantity, self.name
        
        def __iter__(self):  # This helps to visualise the linked list
            yield self.tuple()
            if self.tail:
                yield from self.tail
    

    With this in place, you can do things like:

    fruits = LinkedList([("Apples",7),("Bananas",2),("Dragon Fruit",1),("Pomelo",14),("Grapes",65),("Cherries",43),("Pears",6),("Mangoes",31)])
    print(*fruits)
    

    ...and it will print the entries in the linked list (with quantity as first tuple member). This is very useful when debugging.

    Merge Sort

    As you suggested, let's go for merge sort. This is a recursive algorithm, and so you need a base case. The base case is when the list only has one element, i.e. when its tail attribute is None. In that case the list is already sorted and can be returned without any further action.

    In the recursive case you need to:

    • Split the list into two halves (partition):
      • The first half would start at the same node as before the split, so the only information we need back from this split is the reference to the first node of the second half.
    • Sort each of the halves recursively
    • Merge the two sorted halves

    Let's make the partition method. We need to find a middle node, and for this we can use the tortoise and hare approach:

        def partition(self):
            # Use Tortoise and Hare approach to find (almost) middle node
            slow = self
            fast = self.tail  # Start one step ahead
            while fast and fast.tail:
                fast = fast.tail.tail
                slow = slow.tail
            head2 = slow.tail  # Second half starts after the middle node
            slow.tail = None  # Disconnect the first half of the list from the second half
            return head2  # Return start of second half
    

    Then we need a merge method that gets a second sorted linked list as argument and then merges the nodes of both linked lists into one linked list. We can take first order the two lists by their head nodes, take the first and then merge the rest of that first list with the second until one of the lists has no more nodes:

        @classmethod
        def merge(Cls, head1, head2):
            if not (head1 and head2):  # Base case: If at least one of the lists is empty,
                # ...return the other one
                return head1 or head2
            # Order the lists so that head1 has the node that should come first
            if head1.tuple() > head2.tuple():
                head1, head2 = head2, head1
            head1.tail = Cls.merge(head1.tail, head2)  # Merge rest and prepend head1
            return head1
    

    Finally, we need the mergesort algorithm implemented:

        def mergesort(self):
            if not self.tail:  # Base case. When list has just one node, then it is sorted. 
                return self
            # Split list into two, sort the two halves recursively 
            #    and then merge the sorted halves
            return LinkedList.merge(self.partition().mergesort(), self.mergesort())
    

    And that's it. Run as:

    fruits = LinkedList([("Apples",7),("Bananas",2),("Dragon Fruit",1),("Pomelo",14),("Grapes",65),("Cherries",43),("Pears",6),("Mangoes",31)])
    print(*fruits)
    sortedfruits = fruits.mergesort()
    print("sorted:")
    print(*sortedfruits)