Search code examples
pythonpython-3.xlisttuplesdoubly-linked-list

Build a spreadsheet from a list of tuples of (row, column, value) using doubly linked list (linked lists of linked lists)


I'm hoping to get a result like in Build a spreadsheet from a list of tuples of (row, column, value) using 2D array. no numpy, but instead of 2d lists it is a linked list of linked lists.

Currently I have no idea how to build a linked list of linked lists.

class Cell:
    def __init__(self, row: int, col: int, val: float):
        # a cell object has the row, column and value
        self.row = row
        self.col = col
        self.val = val

    def __str__(self) -> str:
        """
        Convert cell to a string for printing.
        """

        return "(" + str(self.row) + "," + str(self.col) + "," + "{:.2f}".format(self.val) + ")"
class Node:
    """
    A basic type in linked list
    """

    def __init__(self, value=None):
        self.m_value = value
        self.m_next = None
        self.m_prev = None

    def get_value(self):
        return self.m_value

    def get_next(self):
        return self.m_next

    def get_prev(self):
        return self.m_prev

    def set_value(self, value):
        self.m_value = value

    def set_next(self, next):
        self.m_next = next

    def set_prev(self, prev):
        self.m_prev = prev
class LinkedListSpreadsheet(BaseSpreadsheet):
    def __init__(self):
        self.rows = LinkedList()
        self.cols = LinkedList()
        self.length = 0

    def buildSpreadsheet(self, lCells: [Cell]):
        """
        Construct the data structure to store nodes.
        @param lCells: list of cells to be stored
        """
        pass

I need help to implement the SpreadSheet by finishing def buildSpreadsheet(self, lCells: [Cell]) lCell or Cell which has row, column, value.

So lCell = [[9, 9, 20] [2, 5, 7] [3, 1, 6] [8, 5, -6.7], [1, 1, 3]]

The result I'm expecting is a linked list of linked lists that store only the value of Cell in location of row and col.

Like the image at col 1 row 1 the val stored is 3.

expected output


Solution

  • First some remarks:

    • There is no use in creating getters and setters on your Node class, when you allow the caller to set/get any value. In that case just have the caller access the attributes directly.

    • If the aim is to create a Node instance for every row/col combination, even if there is no data in that node, then there really is no benefit in using linked lists, and you'll be better of with nested (standard) lists. Using linked lists might become more interesting when you don't create nodes for where there is no data. This allows to efficiently represent spreadsheets that are very sparse, like when there are only a few values, but sitting in row 825812 and columns 9422 and 16840. So I would suggest to only create nodes for cells with data. You would still store them in the right order.

    • There are many other choices to make about the data structure, but creating separate lists for columns is not a good idea.

    • I would suggest a top-level linked list data structure where the instance inherits from Node: this will be a sentinel (dummy) node. Each new node in this linked list will be node instance that has as value another linked list (nested). This nested linked list will represent a row, and each node in it represents a cell (its value is a Cell instance). Also this nested linked list will be an instance of Node, representing a sentinel node.

    • You could add a method that will turn the linked list structure into the more common list-of-lists structure (with None at unused cells), so you can more easily test the implementation.

    This might not be as you hoped it would be, but I just cannot bring myself of creating a linked list that has a node instance for every coordinate that has no data. It just feels bad:

    class Cell:
        def __init__(self, row: int, col: int, val: float):
            self.row = row
            self.col = col
            self.val = val
    
        def __repr__(self) -> str:
            return f"Cell({self.row},{self.col},{self.val})"
    
    class Node:
        def __init__(self, value=None):
            self.value = value
            self.next = self.prev = None
    
        def insertAfter(self, value):
            other = Node(value)
            other.next = self.next
            if other.next:
                other.next.prev = other
            self.next = other
            other.prev = self
    
        def __str__(self) -> str:
            return f"Node(value:{self.value})"
    
    class LinkedList(Node):
        def __init__(self, sentinelvalue=None):
            super().__init__()
            self.value = sentinelvalue
    
        def __iter__(self):
            node = self.next  # Skip dummy node
            while node:
                yield node.value
                node = node.next
    
        def nodes(self):
            node = self  # Include dummy node
            while node:
                yield node
                node = node.next
        
        def __repr__(self):
            return f"LinkedList={super().__str__()})"
        
    
    class LinkedListRow(LinkedList):
        def __init__(self, cell):
            super().__init__(Cell(cell.row, -1, -1))
            self.set(cell)  # A row always has at least one cell with data
    
        def find(self, col):
            return next(node for node in self.nodes() if not node.next or node.next.value.col > col)
        
        def set(self, cell):
            node = self.find(cell.col)
            if node.value.col == cell.col:
                node.value = cell
            else:
                node.insertAfter(cell)
            
        def __repr__(self):
            return f"Row={super().__str__()})"
        
        def asList(self):
            lst = []
            for value in self:
                lst.extend([None] * (value.col - len(lst)))
                lst.append(value)
            return lst
    
    class LinkedListSpreadsheet(LinkedList):
        def __init__(self):
            super().__init__(LinkedListRow(Cell(-1, -1, -1)))
            
        def find(self, row):
            return next(rowlist for rowlist in self.nodes() if not rowlist.next or rowlist.next.value.value.row > row)
        
        def set(self, cell):
            node = self.find(cell.row)
            if node.value.value.row == cell.row:
                node.value.set(cell)
            else:
                node.insertAfter(LinkedListRow(cell))
    
        def __repr__(self):
            return f"Spreadsheet={super().__str__()})"
    
        def asList(self):
            matrix = []
            for rowlist in self:
                matrix.extend(([] for _ in range(rowlist.value.row - len(matrix))))
                matrix.append(rowlist.asList())
            # pad the inner lists so they have equal length
            maxcols = max(map(len, matrix))
            for row in matrix:
                row.extend([None] * (maxcols - len(row)))
            return matrix
    

    Demo run:

    sheet = LinkedListSpreadsheet()
    sheet.set(Cell(1, 10, 3))
    sheet.set(Cell(1, 3, 9))
    sheet.set(Cell(1, 2, 18))
    sheet.set(Cell(1, 0, -10))
    sheet.set(Cell(4, 5, 55))
    for row in sheet.asList():
        print(row)