functional-programming ocaml binary-tree tree-search

Performance comparison of binary search tree functions

I have these two binary search tree search functions. My question is which one of these functions are more efficient, or if they are equivalent?

Given

type 'a tree = 
  | Empty
  | Node of 'a * 'a tree * 'a tree

let rec search x t =
  match t with
  | Empty -> Empty
  | Node(a, left, right) as t' ->
      if a = x then t'
      else 
        match search x left with
        | Empty -> search x right
        | t'' -> t''

let rec search x tree = 
  match tree with 
  | Empty -> Empty
  | Node (root, left, right) as t -> 
      if (x = root) then t 
      else if (x < root) then search x left
      else search x right

I think that the second one is equivalent simply because we don't need to have another match with Empty since we already have it before.

Solution

It depends

This really depends on whether values in your binary tree are strictly ordered. Your first function will search the right branch if the value is not found in the left branch. The second will only follow either the left or the right branch, but not both.

They are not equivalent.

Performance and stack considerations

If we assume strict ordering then the first function will be less efficient as it will search all nodes. Performance will be O(n) while with the second function, performance will be O(log n). The first function is also not tail-recursive, so you run the risk of a stack overflow for a very large tree.

If you want to make this tail-recursive, one approach is to maintain a stack or worklist of nodes to deal with and work through them as we go. We search left branches first, pushing the right branches onto the stack. Once we hit an Empty left branch, we start consuming the stack.

let rec search x t stack =
  match t, stack with
  | Empty, [] -> Empty
  | Empty, n::ns -> search x n ns
  | Node (v, _, _), _ when x = v -> t
  | Node (_, l, r), _ -> search x l @@ r :: stack

We can hide the stack argument by making this a local function.

let search x t =
  let rec search' x t stack =
    match t, stack with
    | Empty, [] -> Empty
    | Empty, n::ns -> search' x n ns
    | Node (v, _, _), _ when x = v -> t
    | Node (_, l, r), _ -> search' x l @@ r :: stack
  in
  search' x t []

A further suggestion

If we can assume strict ordering in the tree, I would probably suggest the following, using conditional guards.

let rec search x t =
  match t with
  | Empty -> Empty
  | Node (v, _, _) when x = v -> t
  | Node (v, l, _) when x < v -> search x l
  | Node (_, _, r) -> search x r

Which we can reduce to:

let rec search x = function
  | Empty -> Empty
  | Node (v, _, _) as t when x = v -> t
  | Node (v, l, _) when x < v -> search x l
  | Node (_, _, r) -> search x r