How is the match
expression implemented at a high level? What happens under the hood for the compiler to know how to direct certain strains of code to one branch vs. the other, figuring it out at compile time? I don't see how this is possible without storing type information for use at runtime.
Something like this example:
fn tree_weight_v1(t: BinaryTree) -> i32 {
match t {
BinaryTree::Leaf(payload) => payload,
BinaryTree::Node(left, payload, right) => {
tree_weight_v1(*left) + payload + tree_weight_v1(*right)
}
}
}
/// Returns tree that Looks like:
///
/// +----(4)---+
/// | |
/// +-(2)-+ [5]
/// | |
/// [1] [3]
///
fn sample_tree() -> BinaryTree {
let l1 = Box::new(BinaryTree::Leaf(1));
let l3 = Box::new(BinaryTree::Leaf(3));
let n2 = Box::new(BinaryTree::Node(l1, 2, l3));
let l5 = Box::new(BinaryTree::Leaf(5));
BinaryTree::Node(n2, 4, l5)
}
#[test]
fn tree_demo_1() {
let tree = sample_tree();
assert_eq!(tree_weight_v1(tree), (1 + 2 + 3) + 4 + 5);
}
By runtime type information, I mean you literally store a pointer to the type definition or something like that, so under the hood when the runtime execution reaches the match
, it simply checks value.type == given_pattern_expression
or some variant of that. From my understanding, this is not the case. That is, Rust doesn't store the type with the structs/records in the runtime. From my understanding, somehow it is computed at compile time.
I could probably implement this sort of feature if I stored the .type
on each struct/record. Then you just simply look up the type and see if it matches. However, I cannot see a way to implement this at the compiler level, so it figures out matching in advance of runtime.
I think it happens at compile time because of posts like Is it possible to do a compile time type check on a generic in Rust?, or Runtime trait implementation checking?, amongst many other posts which seems to suggest everything happens at compile time, except for a tiny few cases where you can opt-into runtime checking. (this is my understanding from summarizing dozens of articles the past few days). Or another quote is:
One does need to check whether a given BinaryTree is a Leaf or is a Node, but the compiler statically ensures such checks are done: you cannot accidentally interpret the data of a Leaf as if it were a Node, nor vice versa.
That to me says the compiler statically figures out the pattern matching in the BinaryTree
case I posted above. Somehow.
Integrating Pattern Matching with Type Checking is a random post about programming languages that agrees with my understanding, that you need runtime type information to accomplish this. So I am confused.
If types are patterns, you need to be able to determine the type of a value at runtime. In general, this requires runtime type-information (which many languages intentionally erase); but it also requires some supertype that contains the two cases (which many languages intentionally do not have).
Building a runtime reflection system for Rust 🦀️ (Part 1) also explains how Rust doesn't have runtime reflection, which aids in my thinking that everything happens at compile time.
A match
expression does not need runtime type information; as a match
only accepts a single expression of a single known type, by definition it can leverage compile time information.
See also:
match
at compile time vs runtimeAt compile time, every match
expression will be verified to be exhaustive: all possible shapes of the value are handled.
At run time, the code will determine which specific match arm is executed. You can think of a match
as implemented via a fancy if
-else
chain.
As we humans tend to be not-extremely-precise when communicating, it's highly likely that some resources blur the line between these two aspects.
Enum variants are not standalone types. That is, given an enum Foo
, Foo::Bar
is not a type — it's a value of type Foo
. This is the same as how false
is not a type — it's a value of type bool
. The same logic applies for i32
(type) and 42
(value).
In most cases, enums are implemented as a sea of bytes that correspond to the values each enum variant might be, with each variant's data layered on top of each other. This is known as a union.
Then a discriminant is added next to this soup of bytes. This is an integer value that specifies which variant the value is. Adding the discriminant makes it into a tagged union.
Matching on an enum is conceptually similar to this pseudo-Rust:
if discriminant(&enum_value) == VARIANT_1_DISCR {
let data = mem::transmute(data(&enum_value));
} else if discriminant(&enum_value) == VARIANT_2_DISCR {
let data = mem::transmute(data(&enum_value));
}
See also:
how Rust doesn't have runtime reflection
I wouldn't agree that it doesn't have it, but it certainly doesn't have it by default. Runtime type information in Rust will usually leverage the Any
trait:
fn thing(value: &dyn std::any::Any) {
if let Some(s) = value.downcast_ref::<String>() {
dbg!(s.len());
} else if let Some(i) = value.downcast_ref::<i32>() {
dbg!(i + 1);
}
}
The absence of the Any
trait when matching is an additional hint that no runtime type information is present.
you need to be able to determine the type of a value at runtime
This is only true if you allow a value to be multiple types — Rust does not as it's a statically-typed language. In a dynamically-typed language where you want to match a value using a type, you would indeed need to have some amount of runtime type information.