rust enums compilation compiler-optimization packing

Can the Rust compiler pack enums carrying bools?


Suppose I've got the following code, which returns the name of an animal, depending also on whether or not it's an adult:

// True for adults, false for children
enum Animal {
    Cat(bool),
    Dog(bool),
    Turtle(bool),
}
// The returned names are string literals, so the lifetime is 'static.
fn animal_name(animal: Animal) -> &'static str {
    match animal {
        Animal::Cat(true) => "cat",
        Animal::Cat(false) => "kitten",
        Animal::Dog(true) => "dog",
        Animal::Dog(false) => "puppy",
        Animal::Turtle(_) => "turtle",
    }
}

If I compile this, the size of the Animal enum is 2 bytes: one for the discriminant (which variant it is) and one for the bool (rustc 1.84 as of writing). If I write the possible states explicitly instead, the program is logically identical, but the enum compiles down to just one byte, as expected:

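enum Animal {
    Cat,
    Kitten,
    Dog,
    Puppy,
    Turtle,
}
// The returned names are string literals, so the lifetime is 'static.
fn animal_name(animal: Animal) -> &'static str {
    match animal {
        Animal::Cat => "cat",
        Animal::Kitten => "kitten",
        Animal::Dog => "dog",
        Animal::Puppy => "puppy",
        Animal::Turtle => "turtle",
    }
}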
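
For reference, the two sizes can be compared with std::mem::size_of. This is only a minimal sketch: the names WithBool and Flat are made up here so that both definitions fit in one file, and since the default enum layout is not guaranteed, the numbers may differ on other compiler versions.

```rust
use std::mem::size_of;

// Hypothetical name for the first definition: a bool payload per variant.
#[allow(dead_code)]
enum WithBool {
    Cat(bool),
    Dog(bool),
    Turtle(bool),
}

// Hypothetical name for the second definition: every state spelled out.
#[allow(dead_code)]
enum Flat {
    Cat,
    Kitten,
    Dog,
    Puppy,
    Turtle,
}

// Returns (size of WithBool, size of Flat) in bytes.
fn sizes() -> (usize, usize) {
    (size_of::<WithBool>(), size_of::<Flat>())
}

fn main() {
    let (with_bool, flat) = sizes();
    // The question reports 2 and 1 on rustc 1.84; this is not a layout guarantee.
    println!("WithBool: {with_bool} byte(s)");
    println!("Flat: {flat} byte(s)");
}
```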

The Rust compiler already exhaustively verifies each possible enum variation for match statements and will complain if one is missing, so it must already be capable of enumerating all possible variations.

Is there some good reason it can't automatically represent the non-explicit version as more states? The compiler doesn't make guarantees about the literal values the variants take anyway (unless you write them explicitly).

Godbolt containing examples of both. Adding the optimization flag -O doesn't fix it as far as I can tell, but I'm no assembly expert.


Solution

  • The compiler cannot make that optimization because the variant would then hold an invalid bool.

    It seems tautological to even mention, but Cat(bool) means that the enum variant holds a bool. A bool has a very specific representation. From the reference:

    The value false has the bit pattern 0x00 and the value true has the bit pattern 0x01. It is undefined behavior for an object with the boolean type to have any other bit pattern.

    The representation of bool can be used to hold the enum discriminant, but only when that discriminant is zero or it's not holding a bool:

    enum Animal {
        Cat(bool),
        Dog,
        Turtle,
    }
    
    fn main() {
        dbg!(std::mem::size_of::<Animal>());
    }
    
    [src/main.rs:8:5] std::mem::size_of::<Animal>() = 1
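
One way to see why the payload has to stay a real bool is that safe code can borrow it, and the resulting reference must point at a byte holding 0x00 or 0x01. The following is a minimal sketch, assuming the Animal definition from the question. adult_flag is a hypothetical helper, and the printed size is what current rustc produces rather than a documented guarantee.

```rust
use std::mem::size_of;

#[allow(dead_code)]
enum Animal {
    Cat(bool),
    Dog(bool),
    Turtle(bool),
}

// Hypothetical helper: hands out a mutable reference to the bool payload.
// Whatever the enum layout is, this reference must point at a valid bool.
fn adult_flag(animal: &mut Animal) -> &mut bool {
    match animal {
        Animal::Cat(b) | Animal::Dog(b) | Animal::Turtle(b) => b,
    }
}

fn main() {
    let mut pet = Animal::Cat(false);
    // The write goes through a plain &mut bool and stores the pattern 0x01.
    *adult_flag(&mut pet) = true;
    assert!(matches!(pet, Animal::Cat(true)));

    // The two valid bit patterns of a bool.
    println!("false as u8 = {}, true as u8 = {}", false as u8, true as u8);

    // When only one variant carries the bool, the spare bit patterns can
    // encode the other variants, as with Option<bool>.
    println!("Option<bool>: {} byte(s)", size_of::<Option<bool>>());
}
```

If Cat, Kitten, Dog and Puppy were packed into one byte, the byte behind that reference would have to take values other than 0 and 1 to tell Dog from Cat, which is exactly the invalid bool described above.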