
How to implement serde Deserialize for struct that references its parent?


I want to access a parent struct from its children, so I'm trying to store a reference to the parent Chain in each Block inside chain.blocks. But I'm encountering an error:

// Chain
#[derive(Default, Serialize, Deserialize)]
pub struct Chain<'a> {
    pub blocks: Vec<Block<'a>>,
}

// Block
#[derive(Deserialize, Serialize)]
pub struct Block<'a> {
    pub id: String,
    pub prev_hash: String,
    pub timestamp: u64,
    pub nonce: i32,
    pub proof: String,
    pub miner: String,
    pub documents: Vec<String>,
    pub merkel_root: String,
    chain: &'a Chain<'a>,
}

impl<'a> Default for Block<'a> {
    fn default() -> Self {
        Self {
            ..Default::default()
        }
    }
}
the trait bound `&'a chain::Chain<'a>: Deserialize<'_>` is not satisfied
the following implementations were found:
  <chain::Chain<'a> as Deserialize<'de>>
rustc E0277

I already have #[derive(Deserialize)] on both, so I'm not sure what else I could do.


Solution

  • Problems

    What you're trying to make is a self-referential structure, which is troublesome in Rust. See these answers, for example. It's not necessarily impossible, but it would be best if you try to find a different model for your data.
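
    As one possible "different model" (a minimal sketch, not the only option): keep Block free of any back-reference and pass the chain to whatever needs it. The helper names predecessor and is_genesis are hypothetical, and the sketch assumes prev_hash holds the id of the preceding block, which the question does not confirm.

```rust
/// A block that knows nothing about its parent.
#[derive(Debug, Clone, PartialEq)]
pub struct Block {
    pub id: String,
    pub prev_hash: String,
}

/// The chain owns its blocks; there is no back-reference.
#[derive(Debug, Default)]
pub struct Chain {
    pub blocks: Vec<Block>,
}

impl Chain {
    /// Hypothetical helper: find the block that precedes `block`,
    /// assuming `prev_hash` stores the predecessor's `id`.
    pub fn predecessor(&self, block: &Block) -> Option<&Block> {
        self.blocks.iter().find(|b| b.id == block.prev_hash)
    }
}

impl Block {
    /// Anything that needs the parent takes it as a parameter.
    pub fn is_genesis(&self, chain: &Chain) -> bool {
        chain.predecessor(self).is_none()
    }
}
```

    Since both structs then contain only owned data, a plain #[derive(Serialize, Deserialize)] on each should work without lifetimes or custom impls.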

    Also: serde's derive mechanism doesn't care whether a field is pub or not; it will deserialize into it anyway. It also has no way of knowing that you want chain to contain a reference to the parent. In fact, it generates a deserializer for Block that can be used stand-alone. If you do what @PitaJ suggests (chain: Box<Chain>), you'll get a deserializer that expects data like

    {
      "id": "foo",
      …
      "merkel_root": "You're thinking of Merkle, Merkel is the German chancellor a. D.",
      "chain": {
        "blocks": [
          {
            "id": "bar",
            …
            "chain": {
              "blocks": []
            }
          }
        ]
      }
    }
    

    Lastly:

    impl<'a> Default for Block<'a> {
        fn default() -> Self {
            Self {
                ..Default::default()
            }
        }
    }
    

    is infinite recursion: the ..Default::default() inside Block's own Default impl calls Block::default() again. rustc would have warned you about this once you got past the compile errors.

    warning: function cannot return without recursing
      --> src/lib.rs:25:5
       |
    25 |     fn default() -> Self {
       |     ^^^^^^^^^^^^^^^^^^^^ cannot return without recursing
    26 |         Self {
    27 |             ..Default::default()
       |               ------------------ recursive call site
       |
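
    For comparison, two non-recursive ways to get a Default, sketched on a cut-down Block. BlockWithNonce and its default nonce of 1 are made up for illustration. Either way this only works once the &'a Chain<'a> field is gone, since a plain reference to a struct has no Default value.

```rust
/// Option 1: derive it. Every field type here implements `Default`.
#[derive(Debug, Default, PartialEq)]
pub struct Block {
    pub id: String,
    pub timestamp: u64,
    pub nonce: i32,
    pub documents: Vec<String>,
}

/// Option 2 (hypothetical variant): write the impl by hand when some
/// field needs a non-trivial default value.
#[derive(Debug, PartialEq)]
pub struct BlockWithNonce {
    pub id: String,
    pub nonce: i32,
}

impl Default for BlockWithNonce {
    fn default() -> Self {
        // Every field is named explicitly, so nothing calls back
        // into `Self::default()`.
        Self {
            id: String::new(),
            nonce: 1,
        }
    }
}
```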
    

    If you absolutely want this

    … against better advice, then you can get something close with reference counting (and my favourite serde trick):

    First, you need to solve the problem that serde won't know what to store in chain in Block. You could make it an Option and mark it #[serde(skip)], but I prefer having a second struct with only the fields you actually want to be deserialized:

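    // Imports used by the snippets in this answer.
    use std::cell::RefCell;
    use std::rc::{Rc, Weak};

    use serde::{Deserialize, Serialize};

    #[derive(Deserialize)]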
    pub struct ChainSerde {
        pub blocks: Vec<BlockSerde>,
    }
    
    #[derive(Deserialize)]
    pub struct BlockSerde {
        pub id: String,
        // … - doesn't contain chain
    }
    

    The actual structs you want to work with then look like this:

    // Note: Serialize for Rc<…> requires serde's "rc" feature.
    #[derive(Deserialize, Serialize, Debug)]
    #[serde(from = "ChainSerde")]
    pub struct Chain {
        pub blocks: Rc<RefCell<Vec<Block>>>,
    }
    
    #[derive(Debug, Serialize)]
    pub struct Block {
        pub id: String,
        // …
        // If you wanted Chain instead of Vec<Block>, you'd need another
        // #[serde(transparent, from = "ChainSerde")]
        // struct ChainParent(Rc<RefCell<Chain>>)
        #[serde(skip)] // Serialization would crash without
        chain: Weak<RefCell<Vec<Block>>>,
    }
    

    Now, all you need to do is to tell serde how to turn the deserialized struct into the struct you actually want to work with.

    impl From<ChainSerde> for Chain {
        fn from(t: ChainSerde) -> Self {
            let b: Rc<RefCell<Vec<Block>>> = Default::default();
            let bc: Vec<Block> = t
                .blocks
                .into_iter()
                .map(|block| Block {
                    id: block.id,
                    chain: Rc::downgrade(&b),
                })
                .collect::<Vec<Block>>();
            *RefCell::borrow_mut(&b) = bc;
            Chain { blocks: b }
        }
    }
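
    To show how a block then reaches its parent, here is a std-only sketch of the same shape with the serde derives left out so it compiles on its own. from_ids stands in for the From<ChainSerde> impl, and chain_len is a hypothetical accessor that is not part of the code above.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

pub struct Block {
    pub id: String,
    chain: Weak<RefCell<Vec<Block>>>,
}

pub struct Chain {
    pub blocks: Rc<RefCell<Vec<Block>>>,
}

impl Chain {
    /// Stand-in for `From<ChainSerde>`: builds the chain from plain ids.
    pub fn from_ids(ids: Vec<String>) -> Self {
        let shared: Rc<RefCell<Vec<Block>>> = Default::default();
        let blocks: Vec<Block> = ids
            .into_iter()
            .map(|id| Block { id, chain: Rc::downgrade(&shared) })
            .collect();
        *shared.borrow_mut() = blocks;
        Chain { blocks: shared }
    }
}

impl Block {
    /// Hypothetical accessor: number of blocks in the parent chain,
    /// or `None` if the chain has already been dropped.
    pub fn chain_len(&self) -> Option<usize> {
        let parent = self.chain.upgrade()?;
        let len = parent.borrow().len();
        Some(len)
    }
}
```

    Because the children only hold a Weak, dropping the Chain frees the Vec, and upgrade() returns None afterwards. Keep in mind that RefCell checks borrows at run time: calling borrow_mut() while a borrow() is still alive panics.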
    

    If you can use Rust 1.60 or later, you can do it a bit more neatly and without the RefCell by using Rc::new_cyclic:

    #[derive(Deserialize, Debug)]
    #[serde(from = "ChainSerde")]
    pub struct ChainNightly {
        pub blocks: Rc<Vec<BlockNightly>>,
    }
    
    #[derive(Debug)]
    pub struct BlockNightly {
        pub id: String,
        // …
        chain: Weak<Vec<BlockNightly>>,
    }
    
    impl From<ChainSerde> for ChainNightly {
        fn from(t: ChainSerde) -> Self {
            ChainNightly {
                blocks: Rc::new_cyclic(|blocks| {
                    t.blocks
                        .into_iter()
                        .map(|block| BlockNightly {
                            id: block.id,
                            chain: blocks.clone(),
                        })
                        .collect()
                }),
            }
        }
    }
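
    A std-only sketch of what Rc::new_cyclic does here, again without the serde derives. build and sibling_ids are hypothetical names used for illustration. Inside the closure the Rc does not exist yet, so the Weak can be cloned and stored but not upgraded.

```rust
use std::rc::{Rc, Weak};

pub struct Block {
    pub id: String,
    chain: Weak<Vec<Block>>,
}

/// Hypothetical helper: builds the shared block list from plain ids.
pub fn build(ids: Vec<String>) -> Rc<Vec<Block>> {
    Rc::new_cyclic(|weak: &Weak<Vec<Block>>| {
        // The Rc is not fully constructed yet, so upgrading yields None.
        assert!(weak.upgrade().is_none());
        ids.into_iter()
            .map(|id| Block { id, chain: weak.clone() })
            .collect()
    })
}

impl Block {
    /// Hypothetical accessor: ids of all blocks in the parent chain,
    /// or an empty list if the chain has been dropped.
    pub fn sibling_ids(&self) -> Vec<String> {
        match self.chain.upgrade() {
            Some(blocks) => blocks.iter().map(|b| b.id.clone()).collect(),
            None => Vec::new(),
        }
    }
}
```

    The trade-off is that the Vec behind a plain Rc is immutable once built; if blocks need to be appended later, the RefCell variant is the one to use.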
    
