Search code examples
compiler-constructionlanguage-design

How is match implemented in a language like Rust?


I'm not a functional programmer. So I'm not very familiar with pattern matching, patterns, or any of that stuff. To me I only understand the concept of the good old switch statement.

How would a compiler implement a match statement? What exactly is the difference between match and a switch? There's a GNU C99 extension that allows you to have ranges in cases of a switch, is there a difference between:

match x {
    0 ... 9 => ...,
    _ => ...,
}

and

switch (x) {
case 0 ... 9: ...; break;
default: ...; break;
}

Note that the second snippet is a simple C switch with this GNU extension.


Solution

  • A pattern match on constant values can be implemented as a jump table or a sequence of conditional jumps - just like switch statements. Allowing ranges does not change this situation much.

    Rust enums (at least the ones with members) are implemented like tagged unions, i.e. structs that contain a tag and a union of structs that contain the members.

    A pattern match on an enum is then simply translated as a switch on its tag (and also binding the variables bound by the pattern to the members of the union). So something like this Rust code:

    enum Result {
      SingleResult(i32),
      TwoResults(i32, i32),
      Error
    }
    
    match someResult {
      Result::SingleResult(res) => f(res),
      Result::TwoResults(res1, res2) => g(res1, res2),
      Result::Error => error()
    }
    

    would translate to the same machine code (presumably) as the following C code:

    struct Result {
      enum {
        SingleResult, TwoResults, Error
      } tag;
      union {
        struct {
          int arg1;
        } singleResult;
        struct {
          int arg1;
          int arg2;
        } twoResults;
      } value;
    };
    
    switch(someResult.tag) {
      case SingleResult: {
        int res = someResult.value.singleResult.arg1;
        f(res);
        break;
      }
      case TwoResults: {
        int res1 = someResult.value.twoResults.arg1;
        int res2 = someResult.value.twoResults.arg2;
        g(res1, res2);
        break;
      }
      case Error: {
        error();
        break;
      }
    }