
How to deserialize a Vec<SocketAddr>


I currently deserialize a JSON array into Vec<String>, and downstream in my application I convert each String to a SocketAddr.

I would like serde to deserialize directly into Vec<SocketAddr> instead.

use serde::Deserialize;
use std::net::SocketAddr;

#[derive(Debug, Deserialize)]
struct Doc {
    // It would be nice to have Vec<SocketAddr> instead
    hosts: Vec<String>
}

fn main() {
    let data = r#"
        {"hosts": ["localhost:8000","localhost:8001"]}
    "#;
    let doc: Doc = serde_json::from_str(data).unwrap();
    dbg!(doc);
}

Solution

  • I'd like to disagree with the other two answers: You can absolutely deserialize strings like "localhost:80" to a (Vec of) SocketAddr. But you absolutely shouldn't. Let me explain:

    Your problem is that SocketAddr only holds an IP address and a port, not a hostname. You can work around this by resolving hostnames into SocketAddr values through ToSocketAddrs (and then flattening the result, because one hostname can resolve to multiple addresses):

    use serde::{Deserialize, Deserializer};
    use std::net::{SocketAddr, ToSocketAddrs};

    #[derive(Debug, Deserialize)]
    struct Doc {
        #[serde(deserialize_with = "flatten_resolve_addrs")]
        hosts: Vec<SocketAddr>,
    }

    fn flatten_resolve_addrs<'de, D>(de: D) -> Result<Vec<SocketAddr>, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Being a little lazy here about allocations and error handling.
        // Because again, you shouldn't do this.
        let unresolved = Vec::<String>::deserialize(de)?;
        let mut resolved = vec![];
        for host in unresolved {
            // One hostname can resolve to several addresses; keep all of them.
            let addrs = host
                .to_socket_addrs()
                .map_err(|e| serde::de::Error::custom(e))?;
            resolved.extend(addrs);
        }
        // Tripwire: remove this line if you insist on running the code.
        panic!("You really shouldn't. Go read my Stackoverflow answer.");
        Ok(resolved)
    }
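
    The underlying limitation can be seen without serde at all. The following is a minimal sketch using only the standard library; the helper name parses_as_socket_addr is made up for illustration. It shows that parsing a SocketAddr accepts IP literals with a port but rejects hostnames, which is why the resolution step above is needed in the first place.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: true if `s` is an IP literal plus port that
/// `SocketAddr` can represent. No name resolution is attempted.
fn parses_as_socket_addr(s: &str) -> bool {
    s.parse::<SocketAddr>().is_ok()
}

fn main() {
    // The second and third entries are IP literals (IPv4 and bracketed IPv6);
    // the first is a hostname and cannot be stored in a SocketAddr.
    for s in ["localhost:8000", "127.0.0.1:8000", "[::1]:8000"] {
        println!("{:?} parses as SocketAddr: {}", s, parses_as_socket_addr(s));
    }
}
```

    If the input only ever contains IP literals, the resolution step is unnecessary; it only matters for names such as localhost.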
    


    You shouldn't do this because deserialization should round-trip through serialization and be a pure function of the input bytes, whereas resolving addresses may access the network. The problems include:

    • What's the retry logic when resolution fails?
    • If your code is long-running, the resolution result might change (from DNS load balancing, dynamic DNS, network config changes, …), but you can't re-resolve the addresses.
    • If your code is started as a system service, it might fail to start up with a deserialization error if the network isn't fully configured yet.
    • If a connection to one of the specified addresses fails, you have no way of printing a nice error message with the target hostname.
    • How would you configure a default port for this?
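
    Most of these points go away if resolution happens at connect time instead of at deserialization time. Here is a rough sketch of that idea, assuming the configuration keeps the host as a string and the port as an optional number; the function name resolve_for_connect and its parameters are invented for this example, and retry logic is left out.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};

/// Hypothetical helper: resolve `host` right before connecting, falling back
/// to `default_port` when the configuration did not specify a port.
fn resolve_for_connect(
    host: &str,
    port: Option<u16>,
    default_port: u16,
) -> io::Result<Vec<SocketAddr>> {
    let port = port.unwrap_or(default_port);
    (host, port)
        .to_socket_addrs()
        .map(|addrs| addrs.collect())
        // Keep the original host in the error so the message stays readable.
        .map_err(|e| {
            io::Error::new(e.kind(), format!("cannot resolve {}:{}: {}", host, port, e))
        })
}

fn main() {
    // An IP literal is parsed directly, so this call needs no DNS lookup.
    match resolve_for_connect("127.0.0.1", None, 8000) {
        Ok(addrs) => println!("{:?}", addrs),
        Err(e) => eprintln!("{}", e),
    }
}
```

    Because the caller still holds the hostname, it can call this again on every reconnect, retry on failure, and name the host in error messages.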

    (Getting these things wrong is a pet peeve of mine; you'll find it in nginx, Kubernetes, ZooKeeper, …)

    Personally, I'd probably keep the Vec<String> for simplicity. Alternatively, you could deserialize into something like Vec<(Either<IpAddr, Box<str>>, Option<u16>)>: that lets you check at deserialization time that each string is a well-formed address, while deferring hostname resolution and the choice of a default port until you actually connect to those addresses.
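
    As an illustration of that second option, here is a minimal sketch using only the standard library. The names Host and HostPort are invented for the example, a plain enum stands in for Either, and the hostname check (ASCII letters, digits, '-' and '.') is a simplifying assumption rather than a full hostname grammar.

```rust
use std::net::{IpAddr, SocketAddr};
use std::str::FromStr;

/// Stand-in for `Either<IpAddr, Box<str>>` (hypothetical type).
#[derive(Debug, PartialEq)]
enum Host {
    Ip(IpAddr),
    Name(Box<str>),
}

/// A validated but unresolved address (hypothetical type).
#[derive(Debug, PartialEq)]
struct HostPort {
    host: Host,
    port: Option<u16>,
}

impl FromStr for HostPort {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // "1.2.3.4:80" or "[::1]:80": IP literal with a port.
        if let Ok(addr) = s.parse::<SocketAddr>() {
            return Ok(HostPort { host: Host::Ip(addr.ip()), port: Some(addr.port()) });
        }
        // "1.2.3.4" or "::1": bare IP literal, port left for later.
        if let Ok(ip) = s.parse::<IpAddr>() {
            return Ok(HostPort { host: Host::Ip(ip), port: None });
        }
        // Otherwise treat it as "name" or "name:port".
        let (name, port) = match s.rsplit_once(':') {
            Some((name, port)) => {
                let port = port
                    .parse::<u16>()
                    .map_err(|e| format!("invalid port in {:?}: {}", s, e))?;
                (name, Some(port))
            }
            None => (s, None),
        };
        // Simplified syntax check only; nothing is resolved here.
        let valid = !name.is_empty()
            && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '.');
        if !valid {
            return Err(format!("invalid host name in {:?}", s));
        }
        Ok(HostPort { host: Host::Name(name.into()), port })
    }
}

fn main() {
    for s in ["localhost:8000", "localhost", "127.0.0.1:8001"] {
        println!("{:?}", s.parse::<HostPort>());
    }
}
```

    To use such a type with serde, one option that should work is to implement TryFrom<String> for it and put #[serde(try_from = "String")] on the type; that wiring is not shown here. Parsing stays a pure function of the input, and resolution is left to the code that connects.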