During two weeks I've been doing some simple programs in OCaml. I've noticed that when we are working with a recursive structure T
and we want to have the information I
on T
then depending on the information I
we have two types of recursive function.
For simplicity let's assume T
is a binary tree. So I'll use the following type :
type 'a tree = Empty | 'a * 'a tree * 'a tree
Now let's say the information I
can be calculated from left to right on the binary tree. When I am saying left to right it means that the information I
can be calculated from the root to the leaves without getting backward.
To be more clear let's say the information I
we want to have is simply "the number of nodes of the binary tree". Then what's nice with this information is that when we get to all leaves then we get I
, so we are going left to right in the sense that we begin from the root and expend recursively to the left and right subtree and the end case is when we arrived at the leaves.
So we simply have :
let rec nodes = function
|Empty -> 0 (*it's ok we are done here*)
|Node(_,l,r) -> 1 + nodes l + nodes r
What's very nice is that when the information can be calculated left to right then OCaml's pattern matching is a very strong tool and the information I
can be calculated in an easy way.
So more generally we have :
let rec get_information = function
| Empty -> (*here we are done so we return a constant value*)
|Node(_,l,r)-> (*here we apply recusrively the function to the left and right tree*)
Now here comes my problem. Let's say I
is an information that can't be calculated from left to right but from right to left. So it means that to get the information I
we need to begin from the leaves of the tree and extend recursively to the top and we are done only when we get to the root of the binary tree (so the end case is when we get to the root of the binary tree and not the leaves).
For example, let's say the information I
is : "the binary tree has the propriety that for every node the number of nodes in his left subtree is strictly superior to the number of nodes in his right subtree". If we want to solve this in linear time then we need to begin from the leaves and expend recursively to the top (note that I don't necessarily want a solution to the problem).
So to me, it's tricky to write a function that gets the information I
when I
is a right to left information (it needs to begin from the leaves and extend to the top). On the contrary pattern-matching is perfect when the information is a left to right information.
So my question is how to do when we need to write a function that gets the information I
(when I
is right to left)? Are there techniques to solve these kind of problems? Is it still possible to use pattern matching in a tricky way in order to get the desired result?
Pattern matching is useful for writing both kinds of function. Higher order functions called folds can also be used.
First, a concrete version. We will want to know whether a tree is left leaning, and if so, how many nodes it has. An int option
will represent this nicely, with None
indicating any non-left leaning tree.
type 'a tree = Empty | Branch of 'a * 'a tree * 'a tree
let rec tree_info = function
| Empty -> Some 0
| Branch (_, l, r) ->
match tree_info l, tree_info r with
| Some x, Some y when x >= y -> Some (x + y + 1)
| _ -> None
let is_left_leaning tree =
match tree_info tree with
| Some _ -> true
| None -> false
(Note that the condition x >= y
is not 'strictly greater than', but this is deliberate; x > y
is a poor choice. I'll leave figuring out why as an exercise.)
We can also express this style of function in terms of an operation called a right fold. For this operation one provides a value for each constructor of the datatype being folded over: in each place that constructor occurs, the fold operation will use that value to compute the result of the fold:
let rec foldr empty branch = function
| Empty -> empty
| Branch (x, l, r) ->
branch x (foldr empty branch l) (foldr empty branch r)
Note that the empty
value and the Empty
constructor have the same arity, and the branch
value and the Branch
constructor have the same arity, with corresponding argument types. That's characteristic of a right fold.
Given foldr
, we can easily define map
:
let map f tree =
foldr Empty (fun x l r -> Branch (f x, l, r)) tree
Or of course, 'tree_info':
let tree_info tree =
foldr
(Some 0)
(fun _ l r ->
match l, r with
| Some x, Some y when x >= y -> Some (x + y + 1)
| _ -> None)
tree
This is the alternative to pattern matching on the constructors of tree
.