Search code examples
javabinary-search-tree

Java: How to return the node the breaks a binary search tree?


Working on a method that is supposed to return the node that breaks a binary search tree, or null if none of them do. A few of the test cases pass but some of them are failing and I'm not sure why. Here's my code so far:

public static Node checkBSTValidity(Node rootNode) {
    // Your code here (remove the placeholder line below)
    if (rootNode == null) {
       return null;
    }
    
    // Check if left child node is greater than root node
    if (rootNode.left != null && maxValue(rootNode.left) >= rootNode.key) {
       Node node = new Node(maxValue(rootNode.left));
       return node;
    }
    
    // Check if right child node is less than root node
    if (rootNode.right != null && minValue(rootNode.right) <= rootNode.key) {
       Node node = new Node(minValue(rootNode.right));
       return node;
    }
    
    
    Node leftResult = checkBSTValidity(rootNode.left);
    
    if (leftResult != null) {
       return leftResult;
    }
    
    return checkBSTValidity(rootNode.right);
    
}

After submitting the code above, it says: Invalid tree with left/right child linking to ancestor, and my program produces no output. It also says: Invalid tree with left child's key greater than parent's key doesn't return the correct node. Any help would be appreciated!

Edit: For the challenge it says to implement checkBSTValidity() that takes the tree's root node as a parameter and returns the node that is violating tree. A violating node will be one of three things:

  1. A node in the left subtree of an ancestor with a lesser key
  2. A node in the right subtree of an ancestor with a greater key
  3. A node with the left or right field referencing an ancestor

Solution

  • The main issue is that your function is creating a new Node instance and returns that, while the task is to return a node that already exists in the tree.

    But in comments you also mention that the function should check whether a child node is the same node reference as one of its ancestors. There is no provision for that in your code. I would suggest to use a HashSet for that purpose, to which you can add a node each time you recur to one of the sub trees.

    Finally, it is not efficient to call minValue and maxValue on every node, as this way you'll visit nodes multiple times, giving the whole process a time complexity of O(𝑛log𝑛), while this can be done with O(𝑛) time complexity.

    One idea is to pass to the recursive call a "window" that must contain all values in the subtree. Initially this window is unbounded. If the data type for the keys is int, we could use Integer.MIN_VALUE and Integer.MAX_VALUE as boundaries of the window.

    Here is how it could be coded:

    private static Node checkBSTValidity(Node rootNode, int low, int high,
                                         HashSet<Node> ancestors) {
        if (rootNode == null) {
            return null;
        }
        // Track each ancestor as we traverse down the tree
        ancestors.add(rootNode);
        // Any violation?
        if (rootNode.key < low || rootNode.key > high 
                || ancestors.contains(rootNode.left) 
                || ancestors.contains(rootNode.right)) {
            return rootNode;
        }
        // Check the sub trees
        Node node = checkBSTValidity(rootNode.left, low, rootNode.key, ancestors);
        if (node == null) {
            node = checkBSTValidity(rootNode.right, rootNode.key, high, ancestors);
        }
        ancestors.remove(rootNode); // Backtrack
        return node;
    }
    
    public static Node checkBSTValidity(Node rootNode) {
        HashSet<Node> ancestors = new HashSet<>();
        return checkBSTValidity(rootNode, Integer.MIN_VALUE, Integer.MAX_VALUE,
                                ancestors);
    }
    

    To collect all ancestors in a set may seem overkill, but it is needed to track all of them in case duplicate keys are allowed in the tree. We could have a path down the tree with nodes that all have the same keys, and where a child is a reference to one of those ancestors. In that case the key would not violate the BST property, but the back reference should still be detected.