Is matching on patterns faster than if/else in rust?

Let's say you have an enum:

enum Expr {
    binExp,
    unExp, 
    Literal, 
    Group,
}

Should you prefer match or if/else chains to call methods based on Expr?

match self { 
    Expr::binExp  => todo!(),
    Expr::unExp   => todo!(),
    Expr::Literal => todo!(),
    Expr::Group   => todo!()
}

if let Expr::binExp = self {
    todo!()
} else if let Expr::unExp = self {
    todo!()
} else if let Expr::Lit = self {
    todo!()
} else if let Expr::Group(e) = self {
    todo!()
} else {
    todo!()
}

Stated another way, is match a parallel operation? I know if/else will go through each expression sequentially, evaluating them as they are encountered as false and finally make a stop.

Do match statements mimic this behavior or do they directly "jump" to the correct pattern? I ask this specifically because match can match on multiple patterns, but always select the pattern that matches it first. So it does sound like match may be sequential in that regard.

Solution

The compiler aims to produce the same code in each of the two cases, when compiling in release mode. However, the match approach is easier for the compiler to deal with, and in fact there are some examples where the compiler is unable to produce optimal code for the if/else chain.

How this sort of code is supposed to be compiled

The Rust compiler is compiled of two halves. One half, written by the Rust developers, checks that your source code is valid (e.g. by enforcing the borrow checker rules), and then compiles it into LLVM IR; the second half is LLVM, which is a compiler backend shared between multiple different languages, that takes the LLVM IR and produces an executable.

In the case of the code you've written, a) it doesn't matter which order the tests for the various sorts of Expr are written in, and b) it's guaranteed that at least one of them will match. Those are pieces of information that give LLVM a lot of flexibility as to how to express the match in an executable (and at least on the processor I tested, LLVM's choice is to use an array of possible jump locations, and index into them using the discriminant of your enum, i.e. it is actually performing all the tests in parallel).

When you write the match example, the Rust to LLVM IR part of the compilation understands all that information itself. It writes LLVM IR that says "please jump to one of these four locations based on the value of the enum discriminant", using LLVM IR's switch statement which is comparable to Rust's match, and the IR looks something like this (note the "unreachable" in the first line, specifying that the default case of the match is unreachable – the compiler has determined that no options other than these four are possible):

  switch i64 %0, label %default.unreachable [
    i64 0, label %bb3
    i64 1, label %bb4
    i64 2, label %bb5
    i64 3, label %bb2
  ]

With the if/else example, then in earlier stages of compilation, the compiler doesn't understand that doing this with a single test is possible, and it writes out the equivalent of four if statements, each with its own test (even though, technically, only three of the tests are actually necessary because if the first three fail the fourth will always pass). If you compile in debug mode, then the resulting executable actually does each of those tests separately.

What about if you compile in release mode? Well, the compiler has a number of optimizations (running only in release mode) that are designed to detect less efficient code and replace it with more efficient code. In this case, the optimizations notice that the entire chain is simply a check on a single integer (the enum discriminant). So the optimizations end up rewriting the if/else chain into a single switch block, producing almost the same IR as you would get with the match case. In many cases, this means that you get identical code either way.

A case where it does make a difference

Unfortunately, there are some cases where the compiler does fail to handle this situation and produces slightly suboptimal code. Let's say we have an enum with seven options, each of which has no fields, and try the match and if/else chain on it:

pub enum ExampleEnum {
    A, B, C, D, E, F, G
}

pub fn match1(e: ExampleEnum) {
    match e { 
        ExampleEnum::A => panic!("a"),
        ExampleEnum::B => panic!("b"),
        ExampleEnum::C => panic!("c"),
        ExampleEnum::D => panic!("d"),
        ExampleEnum::E => panic!("e"),
        ExampleEnum::F => panic!("f"),
        ExampleEnum::G => panic!("g"),
    }
}

pub fn match2(e: ExampleEnum) {
    if let ExampleEnum::A = e {
        panic!("a")
    } else if let ExampleEnum::B = e {
        panic!("b")
    } else if let ExampleEnum::C = e {
        panic!("c")
    } else if let ExampleEnum::D = e {
        panic!("d")
    } else if let ExampleEnum::E = e {
        panic!("e")
    } else if let ExampleEnum::F = e {
        panic!("f")
    } else if let ExampleEnum::G = e {
        panic!("g")
    } else {
        unreachable!()
    }
}

Playground link

In release mode, we'd expect these to produce the same code. Unfortunately, with this many options, the compiler loses track of the fact that the final else block is unreachable while it's collapsing the whole if/else chain into a single switch, so the LLVM IR looks slightly different in the two cases:

match1 (using match) match2 (using if/else)

`match1` (using `match`)	`match2` (using `if`/`else`)
switch i8 %0, label %bb1 [ i8 0, label %bb3 i8 1, label %bb4 i8 2, label %bb5 i8 3, label %bb6 i8 4, label %bb7 i8 5, label %bb8 i8 6, label %bb2 ] bb1: unreachable	switch i8 %0, label %bb14 [ i8 0, label %bb1 i8 1, label %bb3 i8 2, label %bb5 i8 3, label %bb7 i8 4, label %bb9 i8 5, label %bb11 i8 6, label %bb13 ] bb14: ; call core::panicking::panic tail call void @_ZN4core9panicking5panic…

  switch i8 %0, label %bb1 [
    i8 0, label %bb3
    i8 1, label %bb4
    i8 2, label %bb5
    i8 3, label %bb6
    i8 4, label %bb7
    i8 5, label %bb8
    i8 6, label %bb2
  ]

bb1:
  unreachable

  switch i8 %0, label %bb14 [
    i8 0, label %bb1
    i8 1, label %bb3
    i8 2, label %bb5
    i8 3, label %bb7
    i8 4, label %bb9
    i8 5, label %bb11
    i8 6, label %bb13
  ]

bb14:
; call core::panicking::panic
  tail call void @_ZN4core9panicking5panic…

The compiler has produced a switch block in both cases, but the match specifies the default case as a branch to the LLVM intrinsic unreachable (the equivalent of Rust's unreachable_unchecked!()), whereas with the if/else, the unreachable!() at the end hasn't been proven to be necessarily unreachable, so the default case branches to Rust's unreachable!() macro (which translates into a call to panic()), and because that's all the information that LLVM has, it doesn't optimize the unreachable!() call out of the final program.

Still, having dead code in an executable mostly doesn't matter – if it doesn't run, it isn't taking up much time – but in this seven-element enum case, the resulting executable is going to be marginally slower in the if/else case. Compiling for x86_64 (as a commonly used example), most of the generated assembly code is the same except for the names of line labels, but one small portion near the start of the function differs (I manually changed the whitespace to show the difference):

match1 (using match) match2 (using if/else)

`match1` (using `match`)	`match2` (using `if`/`else`)
match1: subq $56, %rsp movzbl %dil, %eax leaq .LJTI0_0(%rip), %rcx movslq (%rcx,%rax,4), %rax addq %rcx, %rax jmpq *%rax	match2: subq $56, %rsp cmpb $6, %dil ja .LBB1_9 movzbl %dil, %eax leaq .LJTI1_0(%rip), %rcx movslq (%rcx,%rax,4), %rax addq %rcx, %rax jmpq *%rax

match1:
    subq   $56, %rsp


    movzbl %dil, %eax
    leaq   .LJTI0_0(%rip), %rcx
    movslq (%rcx,%rax,4), %rax
    addq   %rcx, %rax
    jmpq   *%rax

match2:
    subq   $56, %rsp
    cmpb   $6, %dil
    ja     .LBB1_9
    movzbl %dil, %eax
    leaq   .LJTI1_0(%rip), %rcx
    movslq (%rcx,%rax,4), %rax
    addq   %rcx, %rax
    jmpq   *%rax

The code is testing to see whether the enum discriminant has a value above 6 (cmpb $6, %dil; ja, i.e. "compare, then jump if above"), and if so, is performing a jump. Of course, as a 7-element enum (whose discrminants are internally numbered 0 to 6 inclusive) it can't have a discriminant above 6, so this test will never pass – but the test is still going to be made whenever the function is called, and so it will run very slightly slower, because the test itself will take some amount of time. (Note that Rust knows that a discriminant of this sort of enum can never go above 6 – but LLVM doesn't, so it can't safely declare the block of code to be unreachable.)

Conclusions

In most cases it isn't going to matter, but the compiler has to do a lot more work to handle the if/else chain – it's something of an unnatural construct for the compiler to handle (e.g. it can't syntactically determine that the final else block is unreachable, and has to do a static analysis to work that out). The compiler would prefer to work with the match block – that allows it to jump directly to the correct line of code with a single test if that's fast on the platform, or to perform the tests in a sequence of its choice otherwise – and in fact it will internally attempt to rewrite the if/else chain into a match block. But that's done quite late in the compile, by which point it's possible that the compiler has lost track of exactly what is known about the set of values that the discriminant can have.

Simply writing the match block directly is going to make life a lot easier – it expresses your intent more clearly both to human programmers reading the code, and to the compiiler, which will need to do less work to understand that this is an exhaustive match on all options of the enum. Because the compiler is doing more work to handle the if/else chain, it increases the chance that something will go wrong which causes the code to be optimized imperfectly. (I suspect that the version with match will probably also give you marginally faster compile times.)