Search code examples
pythonbinary-search-tree

Deletion in BST (python) | unexpected additional deletions?


def delete(node, key):
  if not node: return None 

  # Wrong node, search correct child
  if key < node.data:
    delete(node.left, key)
  elif key > node.data:
    delete(node.right, key)

  # Correct node found
  else: 
    #1. node has no children 
    if not (node.left and node.right): return None
    #2. node has only left child 
    if node.left and not node.right: return node.left
    #3. node has only right child
    if not node.left and node.right: return node.right
    
    #4. node has both left & right children
      ## Need to replace current value with next biggest value
      ## So go right once then all left to end
      ## Once this value is found, assign to appropriate position
      ## Then remove this val from its previous position 
    temp = node.right 
    while temp.left: temp = temp.left 
    node.data = temp.data 
    node.right = delete(node.right, temp.data) 


t = BinaryTree([100, 50, 200, 25, 75, 350])
delete(t.root, 100)

I think that this BST deletion code mostly works, but it's a little buggy. If I delete the root node, 100, then 350 will be missing, following, given the BST, t = BinaryTree([100, 50, 200, 25, 75, 350]).

What is going on here? I'm not sure why 350 has been deleted in the process. I'm wondering if it's related to how I replace the node value upon successful deletion.

Optional but possibly helpful context

class BinaryTreeNode:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

class BinaryTree:
    def __init__(self, *args):
        if len(args) < 1:
            self.root = None
        elif isinstance(args[0], int):
            self.root = BinaryTreeNode(args[0])
        else:
            self.root = None
            for x in args[0]:
                self.insert(x)

   def insert(self, node_data):
        new_node = BinaryTreeNode(node_data)
        if not self.root:
            self.root = new_node
        else:
          # root has no parent, so assign none for 1st iteration 
            parent = None
            temp_pointer = self.root
            while temp_pointer:
              # update parent
                parent = temp_pointer
                
                #update temp_pointer to left or right child
                if node_data <= temp_pointer.data:
                    temp_pointer = temp_pointer.left
                else:
                    temp_pointer = temp_pointer.right
            # eventually, temp_pointer will point to None, exiting while loop        
            # assign to left or right child as appropriate
            if node_data <= parent.data:
                parent.left = new_node
            else:
                parent.right = new_node

Solution

  • There are a few issues:

    • As delete is designed to return a node reference or None, you should make sure not to ignore that returned reference. You did it right near the end of your function (node.right = delete(node.right, temp.data)), but elsewhere delete is called without regards of the returned reference. So:

      • The initial call in the main program should look like this:

        t.root = delete(t.root, 100)
        

        This will ensure that the root attribute is set to None when the last node has been deleted from the tree.

      • The recursive call in the first if block should be:

        node.left = delete(node.left, key)
        
      • And similarly in the second block:

        node.right = delete(node.right, key)
        
    • The function delete should always return a node reference after a recursive call has been made, yet this is missing in many of your cases, so add at the very bottom of your function a kind of "catch all" and return the current reference you have:

      return node
      
    • The condition for identifying a leaf node is wrong. The and should be a or:

      if not (node.left or node.right): return None
      

    The corrected code -- comments indicate changes:

    def delete(node, key):
      if not node: return None 
    
      if key < node.data:
        node.left = delete(node.left, key)  # assign back!
      elif key > node.data:
        node.right = delete(node.right, key)  # assign back!
    
      else: 
        if not (node.left or node.right): return None  # condition corrected
        if node.left and not node.right: return node.left
        if not node.left and node.right: return node.right
        temp = node.right 
        while temp.left: temp = temp.left 
        node.data = temp.data 
        node.right = delete(node.right, temp.data) 
      return node  # always return a node when a recursive call was made
    
    t = BinaryTree([100, 50, 200, 150, 175, 25, 75, 350])
    t.root = delete(t.root, 350)  # assign back!
    

    Considerations

    Not a problem in the algorithm, but it is a good habit to put the body of an if or while statement on the next line, indented

    This function would better be a method on the BinaryTree class -- then the main program should not have to worry about getting/setting the root attribute -- and most of the function's (recursive) logic could be implemented as a method on the BinaryTreeNode class.