Deserializing a JSON field with multiple elements from Strings to a Vec of Vec<u8>s


I have a JSON structure like the following example:

{
    "title": "This is the title of the document",
    "content": "This is a much longer entry containing the full content of a document",
    "version_author": "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY",
    "predecessor": "Qme7ss3ARVgxv6rXqVPiikMJ8u2NLgmgszg13pYrDKEoiu",
    "co_authors": [
        "5GrwvaEF5zXb26Fz9rcQpDWS57CtERHpNehXCPcNoHGKutQY",
        "5FHneW46xGXgs5mUiveU4sbTyGBzmstUspZC92UhjJM694ty",
        "5FLSigC9HGRKVhB9FiEo4Y3koPsNmBmLJbpXg2mp1hXcS59Y"
    ]
}

I'm using serde_json to deserialize my JSON files into the following struct in Rust:

use serde::{Deserialize, Deserializer};

#[derive(Deserialize)]
struct IpfsConsequence {
    // Specify our own deserializing function to convert JSON string to vector of bytes
    #[serde(deserialize_with = "de_string_to_bytes")]
    title: Vec<u8>,
    #[serde(deserialize_with = "de_string_to_bytes")]
    content: Vec<u8>,
    #[serde(deserialize_with = "de_string_to_bytes")]
    version_author: Vec<u8>,
    #[serde(deserialize_with = "de_string_to_bytes")]
    predecessor: Vec<u8>,
    co_authors: Vec<String>,
}

pub fn de_string_to_bytes<'de, D>(de: D) -> Result<Vec<u8>, D::Error>
where
    D: Deserializer<'de>,
{
    let s: &str = Deserialize::deserialize(de)?;
    Ok(s.as_bytes().to_vec())
}

This compiles and I could write my code to use it perfectly well. But using Vec<String> for co_authors feels a bit messy. I would prefer to use Vec<Vec<u8>>, but I can't find a way to do this.

serde_json is smart in its ability to deserialize a field with multiple values into a Vec, and I want it to keep doing that for my co_authors field. But I would then like it to use my de_string_to_bytes deserializer to convert each value within co_authors to a Vec<u8>.

As I can only apply the #[serde(deserialize_with = "de_string_to_bytes")] attribute to an entire field in my struct, doing so would override the default serde_json behaviour of deserializing a field with multiple values into a Vec, which I don't want to override.


Solution

  • You can define a similar function that deserializes a Vec<&str> and converts it to Vec<Vec<u8>>, then name it in the deserialize_with attribute on co_authors:

    pub fn de_vec_string_to_bytes<'de, D>(de: D) -> Result<Vec<Vec<u8>>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let v: Vec<&str> = Deserialize::deserialize(de)?;
        Ok(v.into_iter().map(|s| s.as_bytes().to_vec()).collect())
    }
    


    If you want to use this inside even more data structures, it might be better to create a newtype that wraps Vec<u8> and implements Deserialize, and then use that type everywhere:

    use std::collections::HashMap;
    
    use serde::{Deserialize, Deserializer};
    
    #[derive(Deserialize)]
    struct IpfsConsequence {
        title: Bytes,
        co_authors: Vec<Bytes>,
        maybe_a_map_too: HashMap<String, Bytes>,
    }
    
    #[derive(Deserialize)]
    struct Bytes(#[serde(deserialize_with = "de_string_to_bytes")] Vec<u8>);
    
    pub fn de_string_to_bytes<'de, D>(de: D) -> Result<Vec<u8>, D::Error>
    where
        D: Deserializer<'de>,
    {
        let s: &str = Deserialize::deserialize(de)?;
        Ok(s.as_bytes().to_vec())
    }
    
