Search code examples
cbinary-search-treenodes

Deleting a node with 2 children from binary search tree


I'm working on a C program that scans and stores students' records in the following format into a binary search tree. For the BST I've defined:

struct student
   int id
   char firstname[20]
   char lastname[20]
   float score
   char zipcode[10]

struct node
   struct student data
   struct node* leftChild
   struct node* rightChild

One of the functions of the program should be able to delete records from the tree. The program asks the user for the last name of the student that will be deleted.

The following is the method to traverse the BST to find the target last name to be deleted:

void traverse(node* root, student data)
   if(root != NULL)
      traverse(root->leftChild,data)
      if(strcmp(root->data.lastname,data.lastname) == 0)
         root = delete(root,data)
      traverse(root->rightChild,data)

Sample student structure passed to traverse function is:

struct student John
   int id = 1000
   char firstname = John
   char lastname = Adams
   float score = 90.00
   char zipcode = 92121

The delete function I'm using along with the findmin because I'm aware that I need to find the minimum in the right sub-tree of the root (target node) in order to replace the target node with it:

struct node* findmin(struct node* root)
    while(root->leftChild != NULL)
      root = root->leftChild
    return root

struct node* delete(node* root, student data)
  if(root == NULL) // check if empty tree
    return root

  // traverse towards left if ID is less
  else if(data.iD < root->data.iD)
    root->leftChild = delete(root->leftChild,data)

  // towards right if greater than ID
  else if(data.iD > root->data.iD)
    root->rightChild = delete(root->rightChild,data)

  /* else would mean target last name found */
  else
    /* if found node has no child */
    if(root->leftChild == NULL && root->rightChild == NULL)
      free(root)
      root = NULL

    // 1 child
    else if(root->leftChild == NULL) // if left child is NULL
      struct node* temp = root
      root = root->rightChild
      free(temp)
      temp = NULL

    else if(root->rightChild == NULL) // if right child is NULL
      struct node* temp = root
      root = root->leftChild
      free(temp) 
      temp = NULL

    // 2 children
    else
      struct node* temp = findmin(root->rightChild)
      root->data = temp->data
      root->rightChild = delete(root->rightChild,temp->data)

  return root;

What I would expect to happen here is that student John will be passed into traverse along with the global root of BST and when printing the BST, student John will be deleted and no longer there. However, when I pass student containing a target last name, the name will still be existing in the tree. Furthermore, when I test my code and pass a record to delete, occasionally, the program will delete not the targeted node but the node with the greatest student ID (the node furthest right in the tree) and/or will greet me with a segmentation fault 11. Is there a flaw in the code I am not seeing? Let me know if you need me to provide any further information to help you help me.


Solution

  • There is a first problem with the traverse function. When you delete a node, you need to update the pointer to it. You don't keep track of it.

    Here are the traverse and delete functions using the pointer to node pointer strategy. As you see the code becomes simpler.

    In this code pRoot is a pointer to the root pointer. You can then update it's value. You can set it to NULL or to another node due to the deletion.

    void traverse(node** pRoot, student data){
       if(*pRoot != NULL) {
          traverse(&((*pRoot)->leftChild),data);
          if(strcmp((*pRoot)->data.lastname,data.lastname) == 0)
              delete(pRoot,data);
          traverse(&((*pRoot)->rightChild),data);
    }
    
    
    void delete(node* pRoot, student data)
        if(*pRoot == NULL)
            return;
    
      node *temp = *pRoot;
      if((*pRoot)->leftChild == NULL)
          *pRoot = (*pRoot)->rightChild;
      else if((*pRoot)->rightChild == NULL){
          *pRoot = (*pRoot)->leftChild;
      else {
          // both children are not NULL :
          // we replace *pRoot with the node with 
          // min ID detached from the rightChild.
    
          // find node with min ID in right child
          node **pNode = &((*pRoot)->rightChild);
          while((*pNode)->leftChild != NULL)
              pNode = &((*pNode)->leftChild);
    
          // *pNode is the node with minID
          // its left child is NULL, 
          // but its right child may not be NULL
    
          // we detach the node with min ID
          node *nNode = *pNode;
          *pNode = nNode->rightChild;
    
          //replace *pRoot with *nNode
          nNode->rightChild = (*pRoot)->rightChild;
          nNode->leftChild = (*pRoot)->leftChild;
          *pRoot = nNode;
      }
      free(temp);
    }
    

    Combining the above traverse and delete functions into one function, you get

    void delete(node** pRoot, student *data){
       if(*pRoot == NULL)
           return;
       delete(&((*pRoot)->leftChild), data);
       detele(&((*pRoot)->rightChild), data);
       if(strcmp((*pRoot)->data.lastname,data->lastname) != 0)
           return;
    
       // we delete *pRoot
    
       node *temp = *pRoot;
       if((*pRoot)->leftChild == NULL)
           *pRoot = (*pRoot)->rightChild;
       else if((*pRoot)->rightChild == NULL){
           *pRoot = (*pRoot)->leftChild;
       else {
           // both children are not NULL :
           // we replace *pRoot with the node with 
           // min ID detached from the rightChild.
    
           // find node with min ID in right child
           node **pNode = &((*pRoot)->rightChild);
           while((*pNode)->leftChild != NULL)
               pNode = &((*pNode)->leftChild);
    
           // *pNode is the node with minID
           // its left child is NULL, 
           // but its right child may not be NULL
    
           // we detach the node with min ID
           node *nNode = *pNode;
           *pNode = nNode->rightChild;
    
           //replace *pRoot with *nNode
           nNode->rightChild = (*pRoot)->rightChild;
           nNode->leftChild = (*pRoot)->leftChild;
           *pRoot = nNode;
       }
       free(temp);
    }