
How do I transform special values into Option::None when using Serde to deserialize?


I'm parsing data into:

struct Data {
    field1: Option<f32>,
    field2: Option<u64>,
    // more ...
}

The problem is that my input format represents what would be a None in Rust as the string "n/a".

How do I tell Serde that an Option<T> should be None for the specific string n/a, rather than treating it as an error? We can assume that this doesn't apply to a String.

This isn't the same question as How to deserialize "NaN" as `nan` with serde_json? because that's creating an f32 from a special value whereas my question is creating an Option<Anything> from a special value. It's also not How to transform fields during deserialization using Serde? as that still concerns a specific type.


Solution

  • You can write your own deserialization function that handles this case:

    use serde::de::Deserializer;
    use serde::Deserialize;
    
    // custom deserializer function
    fn deserialize_maybe_nan<'de, D, T: Deserialize<'de>>(
        deserializer: D,
    ) -> Result<Option<T>, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Define a local enum inside the function. Because it is untagged,
        // Serde tries the variants in order and uses the first one that
        // deserializes successfully.
        #[derive(Deserialize)]
        #[serde(untagged)]
        enum MaybeNA<U> {
            // if it can be parsed as Option<T>, it will be
            Value(Option<U>),
            // otherwise try parsing as a string
            NAString(String),
        }
    
        // deserialize into local enum
        let value: MaybeNA<T> = Deserialize::deserialize(deserializer)?;
        match value {
            // if parsed as T or None, return that
            MaybeNA::Value(value) => Ok(value),
    
            // otherwise, if the value is the string "n/a", return None
            // (and fail if it is any other string)
            MaybeNA::NAString(string) => {
                if string == "n/a" {
                    Ok(None)
                } else {
                    Err(serde::de::Error::custom("Unexpected string"))
                }
            }
        }
    }
    

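    Then mark your fields with #[serde(default, deserialize_with = "deserialize_maybe_nan")] so that Serde uses this function instead of its default deserializer. The default attribute keeps a missing field deserializing as None, which deserialize_with on its own would turn into an error: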

    #[derive(Deserialize)]
    struct Data {
        #[serde(default, deserialize_with = "deserialize_maybe_nan")]
        field1: Option<f32>,
        #[serde(default, deserialize_with = "deserialize_maybe_nan")]
        field2: Option<u64>,
        // more ...
    }
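
The decision order used by deserialize_maybe_nan (try the real value first, then accept only the "n/a" sentinel, otherwise fail) can also be looked at in isolation. The following is a dependency-free sketch of that rule for plain text fields, not a description of how Serde's untagged enums work internally. The helper name parse_or_na and the use of FromStr are assumptions made for illustration.

```rust
use std::str::FromStr;

/// Hypothetical helper, not part of Serde: turns a raw text field into
/// `Option<T>`, treating the sentinel "n/a" as `None`.
fn parse_or_na<T: FromStr>(raw: &str) -> Result<Option<T>, String> {
    // First try to read the field as a real value of type T.
    if let Ok(value) = raw.parse::<T>() {
        return Ok(Some(value));
    }
    // Otherwise accept only the sentinel; any other text is an error.
    if raw == "n/a" {
        Ok(None)
    } else {
        Err(format!("unexpected string: {:?}", raw))
    }
}
```

With this sketch, parse_or_na::<u64>("n/a") gives Ok(None) and parse_or_na::<u64>("oops") gives an error. parse_or_na::<String>("n/a") gives Ok(Some("n/a")), because every string parses as a String. This matches the assumption in the question that the rule does not apply to String fields, and it is the same reason the Value variant is listed before NAString in the enum above.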
    

    More information in the documentation: