Search code examples
rustreferencelifetimealgebraic-data-types

Creating recursive enum -- should I use lifetime references? (Rust)


I want to create an ADT like:

pub enum Predicate<'a>{
    And(Box<&'a Predicate>, Box<&'a Predicate>),
    Neg(Box<&'a Predicate>),
    Bool(LogicOp, Literal, Literal)
}

But apparently this is not the correct way to specify lifetimes.

My question is two-fold:

  1. What's the correct way to define lifetimes here?
  2. Is there a better way to tackle this problem? I don't want to clone everything each time, because everything is immutable, and because I don't want to create multiple copies. I also want to do stuff like
let foo = Predicate::Bool(whatever args);
let and = Predicate::And(&foo, &foo);
let neg = Predicate::Neg(&and);
let bar = Predicate::And(&and, &neg);

and so forth and continue to be able to use foo, and, and neg later on.


Solution

  • If everything really is immutable, then the simplest way to handle this scenario is with Rc/Arc (the only difference between the two is that the latter can be used for objects accessed from multiple threads.) You can define your enum as follows:

    pub enum Predicate {
        And(Rc<Predicate>, Rc<Predicate>),
        Neg(Rc<Predicate>),
        Bool(LogicOp, Literal, Literal)
    }
    

    An Rc is a reference-counted shared pointer to a heap-allocated object. You can clone it to obtain a new pointer to the internal data without any copying, and you will never have to worry about lifetimes: internally, the Rc keeps track of how many references are being held to its object, and automatically deallocates it when this number drops to zero. This bookkeeping incurs a small runtime cost whenever the Rc is cloned or dropped, but greatly simplifies dealing with this sort of shared ownership.

    Of course, the use of an Rc makes it somewhat more verbose to create predicates. The example you give would become:

    let foo = Rc::new(Predicate::Bool(whatever args));
    let and = Rc::new(Predicate::And(foo.clone(), foo.clone()));
    let neg = Rc::new(Predicate::Neg(and.clone()));
    let bar = Rc::new(Predicate::And(and.clone(), neg.clone()));
    

    (Technically not all these clones are necessary if you did not intend to use, say, neg later.)

    One way to ease this boilerplate is to use methods or associated functions on the Predicate type to create pre-Rced values, like so:

    impl Predicate {
        pub fn and(a: &Rc<Predicate>, b: &Rc<Predicate>) -> Rc<Predicate> {
            Rc::new(Predicate::And(a.clone(), b.clone())
        }
    
        pub fn neg(a: &Rc<Predicate>) -> Rc<Predicate> {
            Rc::new(Predicate::Neg(a.clone()))
        }
    
        pub fn boolean(op: LogicOp, a: Literal, b: Literal) -> Rc<Predicate> {
            Rc::new(Predicate::Bool(op, a, b))
        }
    }
    

    With this, your example becomes:

    let foo = Predicate::boolean(whatever args);
    let and = Predicate::and(&foo, &foo);
    let neg = Predicate::neg(&and);
    let bar = Predicate::and(&and, &neg);
    

    There is one unavoidable downside to this approach, though. You cannot match through an Rc without dereferencing it first, which can make working with Rc ADTs somewhat painful. See this question for details.