rust types pattern-matching type-inference match-ergonomics

Weird type when pattern matching references

I encountered this strange behaviour while reading this post. The core question of that post is why, when you match (&k, &v) against &(&String, &String), k and v get the type String.

To figure out what's happening, I wrote the following test code, and the result is even more shocking and confusing to me:

Playground Link

fn main() {
    let x: &(&String, &String) = &(&String::new(), &String::new());
    let ref_to_x: &&(&String, &String) = &x;
    let ref_ref_to_x: &&&(&String, &String) = &&x;
    let ref_ref_ref_to_x: &&&&(&String, &String) = &&&x;
    
    // code snippet 1
    let (a, b) = x;                // type of a: &&String, type of b: &&String
    let (a, b) = ref_to_x;         // type of a: &&String, type of b: &&String
    let (a, b) = ref_ref_to_x;     // type of a: &&String, type of b: &&String
    let (a, b) = ref_ref_ref_to_x; // type of a: &&String, type of b: &&String

    // code snippet 2
    let &(a, b) = x;                // type of a: &String, type of b: &String
    let &(a, b) = ref_to_x;        // type of a: &&String, type of b: &&String
    let &(a, b) = ref_ref_to_x;    // type of a: &&String, type of b: &&String
    let &(a, b) = ref_ref_ref_to_x;// type of a: &&String, type of b: &&String

    // code snippet 3
    let (&a, &b) = x;               // type of a: String, type of b: String
    let (&a, &b) = ref_to_x;        // type of a: String, type of b: String
    let (&a, &b) = ref_ref_to_x;    // type of a: String, type of b: String
    let (&a, &b) = ref_ref_ref_to_x;// type of a: String, type of b: String
}

The type annotations for a and b at the end of each line are the ones inferred by rust-analyzer.

Please note that code snippet 3 won't compile: it fails with errors such as "cannot move out of xx because it is borrowed" or "cannot move out of xx which is behind a shared reference". I don't think this matters (perhaps I'm wrong here; if so, please point it out, thanks), because we are concentrating on the types of a and b.

My questions are:

Why do a and b always have the same type even when the RHS has a different type (x/ref_to_x/ref_ref_to_x...) in code snippets 1/2/3?
How does this matching happen? (A step-by-step walk through the matching process would be appreciated.)
How can I work out exactly the same types as rust-analyzer/rustc infer when writing code?

BTW, is this related to RFC 2005 (match ergonomics)? I googled a lot and found many people mentioning it in their answers.

Solution

Yes. What you see is match ergonomics in action, and their behavior may not be what you'd expect.

The way match ergonomics work is by using binding modes. There are three binding modes available, and they can be used even without match ergonomics:

Move. This was the default binding mode before match ergonomics were introduced, and it always tries to move (or copy) the value.
ref. This is what you get when you apply the ref keyword to a binding (unsurprisingly), and it adds one reference. For example, in match e { ref r => ... }, r is a reference to e.
ref mut. Similar to ref, but uses a mutable borrow (and is specified using the ref mut keywords).
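
To make the three modes concrete, here is a minimal sketch that spells each one out explicitly, without relying on match ergonomics. The function names (take, len_by_ref, push_bang) are made up for illustration and are not part of the question.

```rust
/// Move binding: `s` takes ownership of the inner String.
fn take(opt: Option<String>) -> String {
    match opt {
        Some(s) => s, // s: String
        None => String::new(),
    }
}

/// `ref` binding: `s` borrows the inner String, so `*opt` is not moved.
fn len_by_ref(opt: &Option<String>) -> usize {
    match *opt {
        Some(ref s) => s.len(), // s: &String
        None => 0,
    }
}

/// `ref mut` binding: `s` mutably borrows the inner String.
fn push_bang(opt: &mut Option<String>) {
    match *opt {
        Some(ref mut s) => s.push('!'), // s: &mut String
        None => {}
    }
}

fn main() {
    let mut o = Some(String::from("hi"));
    assert_eq!(len_by_ref(&o), 2);
    push_bang(&mut o);
    assert_eq!(take(o), "hi!");
}
```

Match ergonomics essentially picks ref or ref mut for you when the pattern is written as if the value were not behind a reference.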

The process works as follows: the compiler processes the pattern from the outside inwards, starting with move as the binding mode.

Each time the compiler needs to match a non-reference pattern (literal, struct, tuple, slice) against a reference, it automatically dereferences the reference and updates the binding mode: when a & reference is matched against, we get the ref binding mode; for a &mut reference, we get ref if the current binding mode is already ref, and ref mut otherwise. This process repeats until the value being matched is no longer a reference.

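If we're matching against an explicit reference pattern (& or &mut), one reference is removed from the type and the default binding mode is reset back to move. The other patterns that never trigger automatic dereferencing (bindings, wildcards and constants of reference types) leave the binding mode unchanged.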

When a variable is being bound, the compiler looks at the current binding mode: for move, it'll match the type as-is. For ref and ref mut, it will add & or &mut, respectively. But only one.
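
The rules above can be checked with a small sketch. The helper functions and the (String, u8) pair are hypothetical, chosen only to keep the types short; each `let _: T = ...` line is a compile-time assertion of the binding's type.

```rust
/// Tuple pattern against `&&(String, u8)`: both references are peeled off,
/// the mode becomes `ref`, and exactly one `&` is added to each binding.
fn first_len(p: &&(String, u8)) -> usize {
    let (s, n) = p;
    let _: &String = s; // not &&String
    let _: &u8 = n;
    s.len()
}

/// Tuple pattern against `&mut (String, u8)`: the mode becomes `ref mut`.
fn bump(p: &mut (String, u8)) {
    let (s, n) = p;
    s.push('!'); // s: &mut String
    *n += 1; // n: &mut u8
}

/// `&` followed by `&mut`: once the mode is `ref`, it stays `ref`.
fn peek(p: &&mut (String, u8)) -> u8 {
    let (_, n) = p;
    let n: &u8 = n;
    *n
}

/// A `&` pattern resets the mode to move: `n` is a plain u8, copied out.
fn second(p: &(String, u8)) -> u8 {
    let &(_, n) = p;
    n
}

fn main() {
    let mut pair = (String::from("ab"), 1u8);
    assert_eq!(first_len(&&pair), 2);
    bump(&mut pair);
    assert_eq!(pair, (String::from("ab!"), 2));
    assert_eq!(second(&pair), 2);
    assert_eq!(peek(&&mut pair), 2);
}
```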

Let's follow your examples line-by-line.

let (a, b) = x;                // type of a: &&String, type of b: &&String

We match a non-reference pattern (a tuple pattern) against a reference (of type &(&String, &String)). So we dereference the reference and set the binding mode to ref.

Now we have a tuple pattern to match against a tuple of type (&String, &String), with a binding mode of ref. We match a against &String: a is a plain binding, which never triggers dereferencing, so the binding mode stays as it is, namely ref. The type we match against is &String, and ref means we add a reference, so we end up with &&String. Exactly the same thing happens to b.

let (a, b) = ref_to_x;         // type of a: &&String, type of b: &&String

Here, just like in the previous example, we match a non-reference pattern (a tuple pattern) against a reference (&&(&String, &String)). So we dereference and set the binding mode to ref. But we still have a reference: &(&String, &String). So we dereference again. The binding mode is already ref, so we don't need to touch it. We end up matching (a, b) against (&String, &String), which means a and b are each matched against &String. But remember we're using the ref binding mode, so we should add a reference. We add only one reference, even though we dereferenced two! In the end, a and b both have the type &&String.

The remaining examples in this code snippet work the same way.
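
As a sanity check, the types claimed for code snippet 1 can be asserted at compile time. This is a sketch: the wrapper function snippet1 and its return value are assumptions added only so that the result can be tested.

```rust
/// Destructures `x` at several reference depths; every `let _` line is a
/// compile-time check that the bindings really are `&&String`.
fn snippet1(x: &(&String, &String)) -> (usize, usize) {
    let ref_to_x: &&(&String, &String) = &x;
    let ref_ref_to_x: &&&(&String, &String) = &&x;

    let (a, b) = x;
    let _: (&&String, &&String) = (a, b);

    let (a, b) = ref_to_x;
    let _: (&&String, &&String) = (a, b);

    let (a, b) = ref_ref_to_x;
    let _: (&&String, &&String) = (a, b);

    // Method calls auto-dereference through both layers.
    (a.len(), b.len())
}

fn main() {
    let (k, v) = (String::from("key"), String::from("value"));
    assert_eq!(snippet1(&(&k, &v)), (3, 5));
}
```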

let &(a, b) = ref_to_x;        // type of a: &&String, type of b: &&String

Here, we first match the & pattern against the reference of type &&(&String, &String). This removes one reference from each side (the & in the pattern and the outer & of the type) and leaves the binding mode at move, so we now match (a, b) against &(&String, &String). From here on we continue just like in the first example.

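The remaining examples in this snippet are similar: the & pattern strips one reference, and the tuple pattern auto-dereferences whatever is left. The first line, let &(a, b) = x, is the one exception: stripping the & leaves no reference to auto-dereference, so (a, b) is matched against (&String, &String) in move mode and a and b are simply &String.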
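
The two distinct outcomes in code snippet 2 can be checked the same way. Again a sketch, with the hypothetical wrapper snippet2 added only for testing.

```rust
fn snippet2(x: &(&String, &String)) -> (usize, usize) {
    let ref_to_x: &&(&String, &String) = &x;

    // `&` strips the only reference; (a, b) then binds in move mode.
    // `&String` is Copy, so nothing is moved out of `x`.
    let &(a, b) = x;
    let _: (&String, &String) = (a, b);

    // `&` strips one reference; the tuple pattern auto-dereferences the
    // remaining one and switches the binding mode to `ref`.
    let &(a, b) = ref_to_x;
    let _: (&&String, &&String) = (a, b);

    (a.len(), b.len())
}

fn main() {
    let (k, v) = (String::from("key"), String::from("value"));
    assert_eq!(snippet2(&(&k, &v)), (3, 5));
}
```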

let (&a, &b) = x;               // type of a: String, type of b: String

This is the most interesting one. Remember how we talked about reference vs. non-reference patterns? In this example that distinction plays a crucial role.

First we match the tuple pattern against the type &(&String, &String). We dereference the reference and set the binding mode to ref. Now we match the tuple: we have to match &a and &b each against &String, with the binding mode set to ref.

What happens when we match &a against &String? Well, remember & is a reference pattern, and when matching reference patterns we completely ignore the binding mode. So we match &a against &String, with the binding mode reset to move. This removes the reference from both sides, leaving us with a = String. Same for &b.

The next examples in this code snippet are the same.
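
Since a and b would be owned Strings here, the compiler rejects the move out of a shared reference. Below is a sketch of two ways to get at the Strings that do compile; the function names are hypothetical. As far as I know, the 2024 edition additionally rejects a & pattern that appears after match ergonomics has already kicked in, so the first variant writes every reference out explicitly.

```rust
/// Fully explicit pattern: every `&` in the type is written out, and
/// `ref` borrows the String instead of moving it.
fn clone_both_explicit(x: &(&String, &String)) -> (String, String) {
    let &(&ref a, &ref b) = x;
    let _: (&String, &String) = (a, b);
    (a.clone(), b.clone())
}

/// Ergonomic pattern: a and b are `&&String`, so dereference twice
/// before cloning to get an owned String.
fn clone_both_ergonomic(x: &(&String, &String)) -> (String, String) {
    let (a, b) = x;
    ((**a).clone(), (**b).clone())
}

fn main() {
    let (k, v) = (String::from("key"), String::from("value"));
    let expected = (String::from("key"), String::from("value"));
    assert_eq!(clone_both_explicit(&(&k, &v)), expected);
    assert_eq!(clone_both_ergonomic(&(&k, &v)), expected);
}
```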