
Deserialization of json with serde by a numerical value as type identifier


I'm quite new to Rust and come from an OOP background, so I may have misunderstood some Rust basics.

I want to parse a fixed JSON structure with serde. The structure represents one of several message types. Each message has a numeric type attribute that identifies it. The remaining fields usually differ between message types, but some types share the same layout.

{"type": 1, "sender_id": 4, "name": "sender", ...}
{"type": 2, "sender_id": 5, "measurement": 3.1415, ...}
{"type": 3, "sender_id": 6, "measurement": 13.37, ...}
...

First, I defined an enum to distinguish between the message types, plus a struct for each message type. The structs do not have a field storing the type.

#[derive(Debug, Serialize, Deserialize)]
#[serde(tag = "type")]
enum Message {
    T1(Type1),
    T2(Type2),
    T3(Type3),
    // ...
}

#[derive(Debug, Serialize, Deserialize)]
struct Type1 {
    sender_id: u32,
    name: String,
    // ...
}
#[derive(Debug, Serialize, Deserialize)]
struct Type2 {
    sender_id: u32,
    measurement: f64,
    // ...
}
#[derive(Debug, Serialize, Deserialize)]
struct Type3 {
    sender_id: u32,
    measurement: f64,
    // ...
}
// ...

When I try to parse a string into a Message, I get an error.

let message = r#"{"type":1,"sender_id":123456789,"name":"sender"}"#;
let message: Message = serde_json::from_str(message)?; // error here
// Error: Custom { kind: InvalidData, error: Error("invalid type: integer `1`, expected variant identifier", line: 1, column: 9) }

As I understand it, serde tries to determine the variant of the current message, but it expects a string for that. I also tried to write my own deserialize() function that reads the numeric value of the type key and creates the matching object based on it.

How do I have to implement deserialize() to extract the type of the message and create the specific message object? Is it possible to do this without writing a deserialize() function for each Type1/2/3/... struct?

impl<'de> Deserialize<'de> for Message {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Which functions do I have to call here?
        todo!()
    }
}

Or is there a better solution to achieve my deserialization?

I prepared a playground for this issue: Playground


Solution

  • Serde doesn't support integer tags yet (see issue #745).


    If you can change whatever produces the data so that type is a string, i.e. "1" instead of 1, then you can get it working simply by using #[serde(rename)].

    #[derive(Debug, Serialize, Deserialize)]
    #[serde(tag = "type")]
    enum Message {
        #[serde(rename = "1")]
        T1(Type1),
        #[serde(rename = "2")]
        T2(Type2),
        #[serde(rename = "3")]
        T3(Type3),
        // ...
    }
    

    If that's not an option, then you do need a custom Deserialize implementation. In that case, remove Deserialize from the derive list on Message, since a derived and a hand-written implementation would conflict. The shortest in terms of code is likely to deserialize into a serde_json::Value, match on the type, and then deserialize the serde_json::Value into the correct Type{1,2,3}.

    use serde_json::Value;
    
    impl<'de> serde::Deserialize<'de> for Message {
        fn deserialize<D: serde::Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
            let value = Value::deserialize(d)?;
    
            Ok(match value.get("type").and_then(Value::as_u64).unwrap() {
                1 => Message::T1(Type1::deserialize(value).unwrap()),
                2 => Message::T2(Type2::deserialize(value).unwrap()),
                3 => Message::T3(Type3::deserialize(value).unwrap()),
                type_ => panic!("unsupported type {:?}", type_),
            })
        }
    }
    

    You'll probably want to perform some proper error handling, instead of unwrapping and panicking.


    If you need serialization as well, then you will likewise need a custom Serialize implementation. For this you can create helper types to serialize into, since Message itself can no longer derive Serialize.

    use serde::Serializer;
    
    impl Serialize for Message {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            #[derive(Serialize)]
            #[serde(untagged)]
            enum Message_<'a> {
                T1(&'a Type1),
                T2(&'a Type2),
                T3(&'a Type3),
            }
    
            #[derive(Serialize)]
            struct TypedMessage<'a> {
                #[serde(rename = "type")]
                t: u64,
                #[serde(flatten)]
                msg: Message_<'a>,
            }
    
            let msg = match self {
                Message::T1(t) => TypedMessage { t: 1, msg: Message_::T1(t) },
                Message::T2(t) => TypedMessage { t: 2, msg: Message_::T2(t) },
                Message::T3(t) => TypedMessage { t: 3, msg: Message_::T3(t) },
            };
            msg.serialize(serializer)
        }
    }
    

    When you use #[serde(flatten)], serde internally uses serde::private::ser::FlatMapSerializer, which is hidden from the documentation. Instead of creating new types, you could use SerializeMap and FlatMapSerializer directly.

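    However, be warned: because it is undocumented, any future release of serde could break your code if you use FlatMapSerializer directly. The module path itself is one example, as later serde releases expose it as serde::__private rather than serde::private.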

    use serde::{private::ser::FlatMapSerializer, ser::SerializeMap, Serializer};
    
    impl Serialize for Message {
        fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
        where
            S: Serializer,
        {
            let mut s = serializer.serialize_map(None)?;
    
            let type_ = match self {
                Message::T1(_) => 1,
                Message::T2(_) => 2,
                Message::T3(_) => 3,
            };
            s.serialize_entry("type", &type_)?;
    
            match self {
                Message::T1(t) => t.serialize(FlatMapSerializer(&mut s))?,
                Message::T2(t) => t.serialize(FlatMapSerializer(&mut s))?,
                Message::T3(t) => t.serialize(FlatMapSerializer(&mut s))?,
            }
    
            s.end()
        }
    }