I am working on this BST (binary search tree) code challenge:
Write a
BST
class for a Binary Search Tree. The class should support:
- Inserting values with the
insert
method.- Removing values with the
remove
method; this method should only remove the first instance of a given value.- Searching for values with the
contains
method.Note that you can't remove values from a single-node tree. In other words, calling the
remove
method on a single-node tree should simply do nothing.Each
BST
node has an integervalue
, aleft
child node, aright
child node. A node is said to be a validBST
node if and only if it satisfies the BST property; itsvalue
is strictly greater than the values of every node to its left; itsvalue
is less than or equal to the values of every node to its right; and its children nodes are either validBST
nodes themselves orNone
/null
.Sample Usage
// Assume the following BST has already been created: 10 / \ 5 15 / \ / \ 2 5 13 22 / \ 1 14
And I am currently implementing the remove
method. This is my code:
class BST:
def __init__(self, value):
self.value = value
self.left = None
self.right = None
def insert(self, value):
if self is None:
self = BST(value)
if value < self.value:
if self.left is not None:
self.left.insert(value)
else:
self.left = BST(value)
if value >= self.value:
if self.right is not None:
self.right.insert(value)
else:
self.right = BST(value)
return self
def contains(self, value):
if self == None:
return False
if self.value == value:
return True
if value < self.value:
if self.left is not None:
return self.left.contains(value)
if value > self.value:
if self.right is not None:
return self.right.contains(value)
return False
def remove(self, value, parent=None):
if self is None:
return self
elif value < self.value:
if self.left is not None:
self.left.remove(value,self)
elif value > self.value:
if self.right is not None:
self.right.remove(value,self)
elif value == self.value:
if self.right is None and self.left is None:
if parent is None:
pass
else:
if parent.left == self:
parent.left = None
if parent.right == self:
parent.right = None
elif self.right is None:
if parent is None:
self = self.left
else:
if parent.left == self:
parent.left = self.left
if parent.right == self:
parent.right = self.left
elif self.left is None:
if parent is None:
self = self.right
else:
if parent.left == self:
parent.left = self.right
if parent.right == self:
parent.right = self.right
elif self.right is not None and self.left is not None:
sub = self.right.minguy()
self.value = sub
self.right.remove(sub,self)
pass
return self
def minguy(self):
while (self.left is not None):
self = self.left
return self.value
Consider this degenerate tree:
1
\
2
\
3
\
4
...and we have to remove 1. Why does simply self = self.right
not do the removing job?
I was expecting 2 to be the root and 1 to not exist and the tree to end up like:
2
\
3
\
4
Specifically this code:
elif self.left is None:
if parent is None:
self = self.right
else:
if parent.left == self:
parent.left = self.right
if parent.right == self:
parent.right = self.right
Here is the code that tests my implementation:
import program
import unittest
BST = program.BST
class TestProgram(unittest.TestCase):
def test_case_1(self):
root = BST(1)
root.insert(2)
root.insert(3)
root.insert(4)
root.remove(1)
self.assertTrue(not root.contains(1))
self.assertTrue(root.value == 2) # This test fails!
self.assertTrue(root.contains(3))
Why does simply
self = self.right
not do the removing job?
That is indeed the statement that is at the core of the problem. self
is a local name, and thus self =
only assigns something to that local name. Assignments to names can never mutate your tree. To mutate a tree, you need assignments to attributes, not names.
In the special case where you need to remove the root node (and don't have a parent
node whose left
or right
attribute to alter), and where that root has exactly one child, you must realise that you cannot change what the caller's root
name refers to. Whatever you do, the caller's root
will still reference the original root node. So you have to update that root node's attributes.
There are at least two ways to do that:
Either you apply the minguy
logic also when the node has only a right child, and you do the "mirrored" action (i.e. implement maxguy
) for when the node only has a left child, or...
overwrite all the root's attributes (left
, right
and value
) so it becomes a copy of its only child. That child thus becomes disconnected and is therefore deleted instead of the root node.
Here is what you need to change to implement the second idea:
First add a helper method to get a copy from another node:
def copy_from(self, other):
self.value = other.value
self.left = other.left
self.right = other.right
Then replace this:
self = self.left
...with:
self.copy_from(self.left)
And replace:
self = self.right
...with:
self.copy_from(self.right)
This will fix your issue.
You have code that tests whether self
is None
, but this will never happen when your methods are called as methods. I suppose that the testing suite will not call these functions in another way.
elif self.right is not None and self.left is not None:
-- this is the only case that is left over after all previous conditions, so this could be just an else:
When dealing with nodes it is not necessary to explicitly compare with None
. You can just do if self.left:
and if not self.left:
, ...etc.
If you allow remove
to return a node, then you can do it without the parent
parameter, simplifying the code:
def remove(self, value):
if value < self.value:
if self.left:
self.left = self.left.remove(value)
elif value > self.value:
if self.right:
self.right = self.right.remove(value)
elif value == self.value:
if not self.right and not self.left:
return None
elif not self.right or not self.left:
self.copy_from(self.left or self.right)
else:
self.value = self.right.minguy()
self.right = self.right.remove(self.value)
return self
Concerning the code challenge, I must say this is an inferior challenge. It says that "you can't remove values from a single-node tree". That is of course nonsense, and is inspired by a limited vision on how to implement trees. They force you into a certain implementation where a tree must be initialised with one value immediately, and this is because they make no difference between the concept of a node and the concept of a tree. Yet, there is a difference. The notion of an empty tree exists and there is really no good reason why we should not be able to represent an empty tree, or create one for that matter.
The better way to implement a tree is to distinguish the concept of a node and a tree. Both deserve a class. But it wouldn't work with the kind of tests they have prepared, and so they enforce an inferior design.