Search code examples
functional-programmingsml

Nested Pattern Matching - SML - Stepping Through an Example and Very Confused


I'm trying to understand pattern matching and recursion and how the program is executing.

In a simple case:

fun sum_list_num xs =
  case xs of
      [] => NONE 
    | x::xs' => case (sum_list_num xs') of
                    NONE => (print "x: "; print (Int.toString x); 
                             print "\n"; SOME x)
                  | SOME y => (print "x: "; print (Int.toString x);
                               print " y: "; print (Int.toString y); print"\n";
                               SOME (x+y))

I can see that the y1 in SOME y is basically acting as an accumulator as the above is returning in sum_list_num [1,2,3,4,5];:

x: 5
x: 4 y: 5
x: 3 y: 9
x: 2 y: 12
x: 1 y: 14
val it = SOME 15 : int option

Am I right to be thinking that or have I missed the mark?.

Now for a more advanced example I'm trying to wrap my head around pattern matching and I've come across this post but I can't make heads or tails of how it works. I've modified the function to print out what's going on at each 'match':

fun smatch (str, in_stringlist) = 
  case in_stringlist of
      [] => NONE
    | x::x' => case(same_string(x, str), smatch (str, x')) of
                   (true, _) => (print "MATCH CASE x: "; print x; 
                                  print " x': "; print(gather(x'));
                                 print "\n"; SOME x')
                 | (false, NONE) => (print "NONE CASE x: ";print x;
                                     print "\n"; NONE)
                 | (false, SOME z) => (print "SOME Z CASE x: "; print x;
                                       print " z: ";print(gather(z));
                                       print " x': ";print(gather(x'));print "\n";  
                                       SOME (x :: z))

The output I was totally not expecting when called using smatch("this", ["1", "2", "this", "4", "5", "6"])

NONE CASE x: 6
NONE CASE x: 5
NONE CASE x: 4
MATCH CASE x: this x': 4 5 6
SOME Z CASE x: 2 z: 4 5 6 x': this 4 5 6
SOME Z CASE x: 1 z: 2 4 5 6 x': 2 this 4 5 6
val it = SOME ["1","2","4","5","6"] : string list option

Where I'm lost:

  • Why am I returning x' in the match case?
  • Why is everything in the list after the string I'm looking for evaluate to the NONE case?
  • I've been sitting here with a pad of paper and many edits and I'm having a tough time 'following' what exactly is happening in how the recursive calls are being stacked up and evaluated.

Any help would be appreciated. Reading SML guides on pattern matching all day have not made this any clearer!

note: the second example IS a homework question to a Coursera class (I answered it using a let function that used an accumulator to build up the list rather than a nested case statement, I came across the above while googling pattern matching SML and have been trying to wrap my head around it all day).

FYI gather is just a helper function so I can print out the contents of a list of strings:

fun gather xs = 
  foldl (fn (x,acc) =>
            acc  ^ " " ^ x) (hd xs) (tl xs)

Solution

  • I can see that the y1 in SOME y is basically acting as an accumulator

    Am I right to be thinking that or have I missed the mark?

    Yes, you are right.

    The sum_list_num function calls itself recursively and returns the accumulated result to itself. Since the result is wrapped in SOME/NONE, pattern matching is used to unpack the result and pack it back into SOME again. E.g. SOME y becomes SOME (x+y).

    Returning int option seems a little superfluous, since the sum of an empty list is well-defined:

    fun sum_list_num [] = 0
      | sum_list_num (x::xs) = x + sum_list_num xs
    

    Or

    fun sum_list_num xs = foldl op+ 0 xs
    

    But for some problems you do want to pattern match on a resulting SOME/NONE and pack the result back in. In those cases you may also want to consider using the library functions Option.map and Option.getOpt. For example:

    fun maximum [] = NONE
      | maximum [x] = SOME x
      | maximum (x::xs) = Option.map (fn y => Int.max (x, y)) (maximum xs)
    

    Moving on,

    fun smatch (str, in_stringlist) = 
      case in_stringlist of
          [] => NONE
        | x::x' => case(same_string(x, str), smatch (str, x')) of
                       (true, _) => (print "MATCH CASE x: "; print x; 
                                      print " x': "; print(gather(x'));
                                     print "\n"; SOME x')
                     | (false, NONE) => (print "NONE CASE x: ";print x;
                                         print "\n"; NONE)
                     | (false, SOME z) => (print "SOME Z CASE x: "; print x;
                                           print " z: ";print(gather(z));
                                           print " x': ";print(gather(x'));print "\n";  
                                           SOME (x :: z))
    

    Why am I returning x' in the match case?

    A better question might be: What is this function doing? Since then a part of that answer could be "... and return the rest of the list (x')." One part of this code that makes it hard to read, I think, is that smatch (str, x') is evaluated in each loop iteration, before the case body, regardless of whether its return value is used or not.

    By handling the tail of the input list first, it is handled backwards (either by design or by accident).

    Why is everything in the list after the string I'm looking for evaluate to the NONE case?

    This is perhaps best shown by evaluating the function by hand for a small input. Every time I write a ~>, I've evaluated the next term (an input to a function before it's run, or a function call if all its inputs are evaluated. The first few term rewrites are:

    smatch ("2", ["1", "2", "3"]) ~>
      case (same_string("2", "1"), smatch ("2", ["2", "3"])) of ... ~>
      case (false, smatch ("2", ["2", "3"])) of ... ~>
      case (false, case (same_string("2", "2"), smatch ("2", ["3"])) of ...) of ... ~>
      case (false, case (true, case (same_string ("2", "3"), smatch ("2", [])) of ...) of ...) of ... ~>
      case (false, case (true, case (false, smatch ("2", [])) of ...) of ...) of ... ~>
      case (false, case (true, case (false, NONE) of ...) of ...) of ...
    

    At this point, the list has been fully traversed, the last element has been determined not to be "2", and the first case of a case-of statement where the pair it's matching ((false, NONE)) is actually being compared. We continue...

      case (false, case (true, (print "NONE CASE x: ";print x; print "\n"; NONE)) of ...) of ... ~>
      case (false, case (true, NONE) of ...) of ... ~>
      case (false, (print "MATCH CASE x: "; print x;  print " x': "; print(gather(x')); print "\n"; SOME ["3"])) of ... ~>
      case (false, SOME ["3"]) of ... ~>
    

    At this point, the (false, NONE) pattern case was matched against first, then the (true, _) pattern, and eventually the (false, SOME z) pattern:

    (print "SOME Z CASE x: "; print x; print " z: "; print(gather(z)); print " x': "; print(gather(x')); print "\n"; SOME ("1" :: "3")) ~>
    SOME ["1", "3"]
    

    I've been sitting here with a pad of paper and many edits and I'm having a tough time 'following' what exactly is happening in how the recursive calls are being stacked up and evaluated.

    Besides evaluating your function by hand (which is difficult when your term rewrite results in a huge case-of), you could structure your code differently so that it is more apparent when you evaluate the tail of your list before the actual elements. There doesn't really seem to be a need for nested pattern matching, either. And since this function really just filters out the first instance of str, why not write it like:

    fun smatch (str, []) = []
      | smatch (str, str2::strs) =
        if str = str2
        then strs
        else str2::smatch(str, strs)
    

    This function is much easier to evaluate by hand:

    smatch ("2", ["1", "2", "3"]) ~>
      if "2" = "1" then ... else "1"::smatch("2", ["2", "3"]) ~>
      "1"::smatch("2", ["2", "3"]) ~>
      "1"::(if "2" = "2" then ["3"] else ...) ~>
      "1"::["3"]
      ["1", "3"]
    

    By the way, your gather function already exists in the standard library as String.concatWith " ".