Deserialize a JSON file with multiple data types for a key


I want to analyze Telegram chats, so I exported a chat in JSON format and tried to deserialize it into my analysis software.

    {
      "id": 397910,
      "type": "message",
      "date": "2018-02-21T10:27:59",
      "edited": "1970-01-01T01:00:00",
      "from": "Username",
      "from_id": 39033284,
      "text": "Some Text"
    }

I used this simple code to read the JSON:

    List<JSONObject> jsonObjects = JsonConvert.DeserializeObject<List<JSONObject>>(File.ReadAllText(openFileDialog.FileName));

    public class JSONObject
    {
       public int ID;
       public string type;
       public string date;
       public string edited;
       public string from;
       public int fromID;
       public string photo;
       public int width;
       public int height;
       public string text;
    }

This worked for the first 525 records, but after that deserialization failed because of "consistency issues": the data type of text sometimes changes to an array.

    {
       "id": 397911,
       "type": "message",
       "date": "2018-02-21T10:31:47",
       "edited": "1970-01-01T01:00:00",
       "from": "Username",
       "from_id": 272964614,
       "text": [
          "Some Text ",
          {
             "type": "mention",
             "text": "@school"
          },
          " Some Text"
       ]
    }

I also found this record:

    {
       "id": 397904,
       "type": "message",
       "date": "2018-02-21T10:18:12",
       "edited": "1970-01-01T01:00:00",
       "from": "Username",
       "from_id": 39033284,
       "text": [
          {
             "type": "link",
             "text": "google.com"
          },
          "\n\nSome Text"
        ]
    }

I don't know how to deserialize the data when it shows this kind of inconsistency.


Solution

  • As your property is complex, you'll need to write your own deserialization logic.

    Here's mine, but it's just an example:

    • First of all, your text property seems to be
      • A single value
      • Or an array of values

    In this case, I'll go for an "always list" result; a single value simply becomes a list with one entry.

    public List<TextProperty> text;
    
    • The value can also be
      • A single string value
      • An object with the string value and a piece of metadata (the text type)

    Again, I'll go for an "always object" result, with no type if it's a plain string.

    public class TextProperty
    {
        public string text { get; set; }
        public string type { get; set; }
    }
    

    Then you have to write your own converter to handle this: inherit from JsonConverter and implement the logic.

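    // Requires: using System; using System.Collections.Generic;
    //           using Newtonsoft.Json; using Newtonsoft.Json.Linq;
    public class TextPropertyConverter : JsonConverter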
    {
        public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
        {
            throw new NotImplementedException(); // not covered here
        }
    
        // A value can be either single string or object
        // Return a TextProperty in both cases
        private TextProperty ParseValue(JToken value) 
        {
            switch(value.Type)
            {
                case JTokenType.String:
                    return new TextProperty { text = value.ToObject<string>() };
    
                case JTokenType.Object:
                    return value.ToObject<TextProperty>();
    
                default:
                    return null;
            }
        }
    
        public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
        {
            // You'll start either with a single value (we'll convert to list of one value) or an array (list of several values then)
            switch(reader.TokenType)
            {
                case JsonToken.String:
                case JsonToken.StartObject:
                    return new List<TextProperty> { ParseValue(JToken.Load(reader)) };
    
                case JsonToken.StartArray:
                    var array = JArray.Load(reader);
                    var list = new List<TextProperty>();
                    foreach (var item in array)
                        list.Add(ParseValue(item));
                    return list;
    
                default:
                    return null;
            }
        }
    
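        // Not consulted when the converter is applied with a [JsonConverter] attribute,
        // which is how it is used here, so returning false is fine.
        public override bool CanConvert(Type objectType) => false;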
    }
    

    I think that covers all cases.

    To use it, simply add the JsonConverter attribute to the target property:

    public class JSONObject
    {
        public int id;
        public string type;
        public string date;
        public string edited;
        public string from;
        public int from_id;
        public string photo;
        public int width;
        public int height;
    
        [JsonConverter(typeof(TextPropertyConverter))]
        public List<TextProperty> text;
    }
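
    Since the goal is to analyze the chat, it may be convenient to collapse the segment list back into a single string once it has been deserialized. The following is a minimal sketch of that idea, not part of Json.NET or of the converter above: ToPlainText is a hypothetical extension method, and it assumes the segments should simply be concatenated in order, ignoring their type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class TextProperty
{
    public string text { get; set; }
    public string type { get; set; }
}

public static class TextPropertyExtensions
{
    // Hypothetical helper: joins the segments in order and skips null entries
    // (the converter returns null for tokens it does not recognize).
    public static string ToPlainText(this IEnumerable<TextProperty> segments)
    {
        if (segments == null) return string.Empty;
        return string.Concat(segments.Where(s => s != null).Select(s => s.text));
    }
}

public static class PlainTextDemo
{
    public static void Main()
    {
        var segments = new List<TextProperty>
        {
            new TextProperty { text = "Some Text " },
            new TextProperty { type = "mention", text = "@school" },
            new TextProperty { text = " Some Text" }
        };

        Console.WriteLine(segments.ToPlainText()); // prints "Some Text @school Some Text"
    }
}
```

    With this in place, message.text.ToPlainText() gives the same kind of string for both the single-value and the array form of text.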
    

    And then test it:

        static void Main(string[] args)
        {
            string json = @"
            [
                {
                  ""id"": 397910,
                  ""type"": ""message"",
                  ""date"": ""2018-02-21T10:27:59"",
                  ""edited"": ""1970-01-01T01:00:00"",
                  ""from"": ""Username"",
                  ""from_id"": 39033284,
                  ""text"": ""Some Text""
                },
    
                {
                   ""id"": 397911,
                   ""type"": ""message"",
                   ""date"": ""2018-02-21T10:31:47"",
                   ""edited"": ""1970-01-01T01:00:00"",
                   ""from"": ""Username"",
                   ""from_id"": 272964614,
                   ""text"": [
                      ""Some Text "",
                      {
                         ""type"": ""mention"",
                         ""text"": ""@school""
                      },
                      "" Some Text""
                   ]
                }
            ]";
    
            List<JSONObject> jsonObjects = JsonConvert.DeserializeObject<List<JSONObject>>(json);
    
            Console.Read();
        }
    

    Here are the results:

    [Image: "Tada"]
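
    To check the result without a debugger, the pieces above can be combined into one console program that prints each message's segments. This is a condensed sketch rather than the exact code above: it assumes the Newtonsoft.Json package is referenced, trims JSONObject down to the two fields it prints, and uses Program and its Json constant as illustrative names.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class TextProperty
{
    public string text { get; set; }
    public string type { get; set; }
}

// Condensed version of the converter: always produces a list of TextProperty.
public class TextPropertyConverter : JsonConverter
{
    public override bool CanConvert(Type objectType) => false; // used via attribute only

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotImplementedException(); // writing is not covered in this sketch
    }

    private static TextProperty ParseValue(JToken value)
    {
        switch (value.Type)
        {
            case JTokenType.String: return new TextProperty { text = (string)value };
            case JTokenType.Object: return value.ToObject<TextProperty>();
            default: return null;
        }
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        switch (reader.TokenType)
        {
            case JsonToken.String:
            case JsonToken.StartObject:
                return new List<TextProperty> { ParseValue(JToken.Load(reader)) };
            case JsonToken.StartArray:
                var list = new List<TextProperty>();
                foreach (var item in JArray.Load(reader))
                    list.Add(ParseValue(item));
                return list;
            default:
                return null;
        }
    }
}

public class JSONObject
{
    public int id;

    [JsonConverter(typeof(TextPropertyConverter))]
    public List<TextProperty> text;
}

public static class Program
{
    // Illustrative input: one message for each shape of "text" seen in the export.
    public const string Json = @"[
        { ""id"": 397910, ""text"": ""Some Text"" },
        { ""id"": 397911, ""text"": [ ""Some Text "", { ""type"": ""mention"", ""text"": ""@school"" }, "" Some Text"" ] }
    ]";

    public static void Main()
    {
        var messages = JsonConvert.DeserializeObject<List<JSONObject>>(Json);
        foreach (var message in messages)
        {
            Console.WriteLine("message {0}: {1} segment(s)", message.id, message.text.Count);
            foreach (var segment in message.text)
                Console.WriteLine("  [{0}] \"{1}\"", segment.type ?? "plain", segment.text);
        }
        // Expected output:
        // message 397910: 1 segment(s)
        //   [plain] "Some Text"
        // message 397911: 3 segment(s)
        //   [plain] "Some Text "
        //   [mention] "@school"
        //   [plain] " Some Text"
    }
}
```

    The plain string and the mixed array both end up as a List<TextProperty>, so the analysis code only has to handle one shape.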