Search code examples
data-structuresrustlinked-listcons

How does Cons save data (Rust Linked List)


I have been learning Rust and decided doing some basic type would help to learn the language at a deeper level.

The "Rust by Example" page for linked lists states

Ln 4: // Cons: Tuple struct that wraps an element and a pointer to the next node

Which I think means that it is recursively creating the list by always making an empty node to populate with Cons.

enum linkedList
{
    Head(Head), // Front Pointer and list metrics
    Cons(Arc<linkedList>, isize), //(Data, Next Value) Cons is apparently a LISP construct
    Tail(isize), // Rear Pointer
    Nil //Used to drop the stream
}

My real question is what is the underlying mechanism that allows data to be stored in the Arc<linkedList> node? I thought it would take a generic (<T>) to store the data on the list but apparently this is incorrect.

p.s I am under the impression ARC and BOX smart pointers are interchangeable but used for different purposes. I was trying to make a thread safe version of a single ended rollover safe linked list, sort of like a circular queue.


Solution

  • Your implementation slightly deviates from the standard definition of Cons lists. The straight-forward definition (similar to how you would do it in Lisp) is

    type Data = isize;
    
    enum List {
        Nil,
        Cons(Box<List>, Data),
    }
    

    As you can see, the Cons variant is made up of the nested list and this nodes data element. In your case, each node has an isize.

    If you have an Arc<List> or Box<List>, this nested List object could be the Cons variant too and carry another isize. Arc and Box don't care about what they are pointing to.


    There are some things that are not quite idiomatic. Having both the Tail and Nil variant doesn't make much sense, because now you have two ways to signal the end of the list. Similarly, having the Head be a variant of the list is strange, because the head of the list is only at the beginning, your implementation allows for a Head variant in the middle of the list though.

    It is preferable to not have an extra Nil node to signal the end of the list. Instead, the last node knows that it is the last (like your Tail variant) so that you don't have an extra allocation for the empty node. This is what I would perceive as an idiomatic definition of a singly linked list in Rust:

    struct List {
        // ...
        // extra info for list head
        // ...
    
        first_node: Option<Box<ListNode>>,
    }
    
    struct ListNode {
        value: Data,
        next: Option<Box<ListNode>>,
    }
    

    To make this generic, we simply have to remove our type alias Data from earlier and make it a generic parameter instead:

    struct ListNode<T> {
        value: T,
        next: Option<Box<ListNode<T>>>,
    }
    

    About Box and Arc: Box is an owned pointer to a value on the heap, that means only this one box owns the memory it points to. Arc is a thread safe version of Rc, which is a reference-counted heap value. Multiple Rc-pointers to this memory can exist and read it, the number of these is counted. Thus, the ownership may be shared.

    You should pick which to use, based on whether or not you want to create more reference-counted pointers to the nodes (note that you probably want either Rc<RefCell<Node>> or Arc<Mutex<Node>> to also mutate the list after creating it). If you only ever have the head of the list and want to iterate over it, pick Box instead.