Search code examples
listrecursionocamlrecord

Search Recursive Record Type OCaml


I am trying to recursively search for a field value in a record that is a recursive record type.

My record type is

type node = {node_name:string; node_branch:node list}

First, I tried to just traverse a tree like variable of this type:

let tree = {node_name="trunk"; node_branch=[{node_name="branch1L:1"; node_branch=[{node_name="branch2L:1"; node_branch=[]};
                                                                                  {node_name="foo_node"; node_branch=[]};];};
                                            {node_name="branch1L:2"; node_branch=[]}];}
    in
    let target = "foo_node" in
    print_newline();
    search_tree_for_name target tree;

The idea, is that it is kind of like this:

                    trunk
            __________|_________
           |                    |
       branch1L:1           branch1L:2
    ______|_______
   |              |
branch2:1       foo_node

So I am searching for a record that has the field value of node_name = foo_node. The way I traversed it, just to prove out my logic was:

let rec search_tree_for_name target_name in_tree =
  if in_tree.node_name = target_name then
    (* First check the node_name *)
      Format.printf "[%s] Target node_name found\n" in_tree.node_name
  else
    (* Then evaluate the branches *)
    begin
      match in_tree.node_branch with
      | [] ->
         Format.printf "[%s] Branches: 0; End of branch\n" in_tree.node_name
      | head :: tail ->
         Format.printf "[%s] Branches: %i\n" in_tree.node_name (List.length in_tree.node_branch);
         search_tree_for_name target_name head;
         List.iter (search_tree_for_name target_name ) tail;
    end

This prints out:

[trunk] Branches: 2
[branch1L:1] Branches: 2
[branch2L:1] Branches: 0; End of branch
[foo_node] Target node_name found
[branch1L:2] Branches: 0; End of branch

Now, what I really want is to just see if an instance exists where the node_name is what I am looking for, so I just want to return a boolean.

What I tried was (I created a new function just for testing purposes):

let rec check_tree_for_name target_name in_tree =
  if in_tree.node_name = target_name then
    begin
    (* First check the node_name *)
      Format.printf "[%s] Target node_name found\n" in_tree.node_name;
      true
    end 
  else
    (* Then evaluate the branches *)
    begin
      match in_tree.node_branch with
      | [] ->
         Format.printf "[%s] Branches: 0; End of branch\n" in_tree.node_name;
         false
      | head :: tail ->
         Format.printf "[%s] Branches: %i\n" in_tree.node_name (List.length in_tree.node_branch);
         check_tree_for_name target_name head;
         List.iter (check_tree_for_name target_name ) tail;
    end

I am getting the following Error on my function I pass to the List.iter:

Error: This expression has type node -> bool
       but an expression was expected of type node -> unit
       Type bool is not compatible with type unit 

Then I changed the last pattern matching block to

  | head :: tail ->
     Format.printf "[%s] Branches: %i\n" in_tree.node_name (List.length in_tree.node_branch);
     if check_tree_for_name target_name head then true
     else List.iter (check_tree_for_name target_name ) tail;

So that the if conditionals checks the return of the function call and kind of passes it on. Now I have one less error but I still have the same error.

My understanding is that List.iter will pass each element of the passed list as the last argument to the function that is passed to List.iter. So this should iterate over the elements of my list that I pass to it and the only ways out are if the node_name field matches what I am looking for or if it is an empty list.

The major difference I see between my two implementations is that the first one keeps going until all nodes are evaluated and the second implementation goes until I find what I'm looking for.

Any suggestions would be appreciated and if possible a short explanation of where I am going wrong or at least highlight where I am going wrong (I feel like I'm picking up a lot of functional programming concepts but there still a few that are eluding me).


Solution

  • Well, in the last branch, if I understand well, you want to know if one of the nodes in tail has the good name. In that case, just use List.exists :

    | head :: tail ->
             Format.printf "[%s] Branches: %i\n" in_tree.node_name 
                (List.length in_tree.node_branch);
             check_tree_for_name target_name head ||
             List.exists (check_tree_for_name target_name ) tail
    

    And it will work as intended (note the || between check_tree... and List.exists .... If check_tree ... returns true, you don't need to check the rest of the list.


    Small explanation :

    In OCaml, all the branches of a pattern matching must have the same type. Here, your first branch had a boolean type because you ended it with false and the second branch had the unit type because you ended it with List.iter .... List.iter is of type ('a -> unit) -> 'a list -> unit and applies its first argument to all the elements in the list given as second argument. List.exists is of type ('a -> bool) -> 'a list -> bool and returns true if one element in the list given as the second argument satisfies the predicate given as the first argument and false if no such element exists.

    Equivalently, if cond then e1 else e2 is like a pattern matching so e1 and e2 should have the same type so your second attempt was as wrong as the first one. ;-)