Search code examples
c++c++11searchbinary-search-tree

counting number of elements less than X in a BST


I had implemented a BST for a multiset using the C++ code below, whereas each node contains the number of occurrence num of each distinct number data, and I try to find the number of elements less than certain value x, using the order function below.

It works, however, inefficient in terms of execution time. Is there any method with better time complexity?

struct Node {
    int data;
    int height;
    int num;

    Node *left;
    Node *right;
};
int order(Node *n, int x) {
    int sum = 0;
    if (n != NULL) {
        if (n->data < x) {
            sum += n->num;
            sum += order(n->right, x);
        }
        sum += order(n->left, x);
    }
    return sum;
}

Solution

  • You can bring the algorithm down to O(logN) time by storing in each node the number of elements in the subtree of which it is the root. Then you'd only have to recurse on one of the two children of each node (go left if x < node->data, right if x > node->data), which if the tree is balanced only takes logarithmic time.

    struct Node {
        int data;
        int height;
        int num;
        int size; // numer of elements in the subtree starting at this node
    
        Node *left;
        Node *right;
    };
    
    int order(Node *n, int x) {
        if(n == NULL) return 0;
    
        // elements less than n->data make up the whole left subtree
        if (x == n->data) {
            return n->left ? n->left->size : 0;
        }
        // even smaller? just recurse left
        else if (x < n->data) {
            return order(n->left, x);
        }
        // bigger? take the whole left subtree and part of the right one
        else {
            return (n->left ? n->left->size : 0) + order(n->right, x);
        }
    }
    

    Of course, now you have to keep track of the size, but this can be done very efficiently when updating the tree: simply recalculate the size (n->left->size + n->right->size + 1) of each modified node in an insertion or deletion.