c# json linq json.net google-speech-api

Parsing multiple objects using JObject in Newtonsoft.Json.Linq


I am trying to parse the result from the Google speech-to-text API. The JSON response is:

{"result":[]}
{"result":[
          {"alternative":[
                         {"transcript":"hello Google how are you     feeling","confidence":0.96274596},
                         {"transcript":"hello Google how are you today","confidence":0.97388196},
                         {"transcript":"hello Google how are you picking","confidence":0.97388196},
                         {"transcript":"hello Google how are you kidding","confidence":0.97388196}
                         ]
         ,"final":true}]
,"result_index":0
}

Now I am trying to parse it with JObject. The problem is that the root object containing "result" appears twice, so how do I parse the second one? Here is the code I am trying:

    StreamReader SR_Response = new StreamReader(HWR_Response.GetResponseStream());
    Console.WriteLine(SR_Response.ReadToEnd()+SR_Response.ToString());
    String json_response = SR_Response.ReadToEnd() + SR_Response.ToString();
    JObject joo = JObject.Parse(json_response);
    JArray ja = (JArray)joo["result"];

    foreach (JObject o in ja)
    {
        JArray ja2 = (JArray)o["alternative"];
        foreach (JObject h in ja2)
        {
            Console.WriteLine(h["transcript"]);
        }
    }

The next approach I tried, using JsonConvert.DeserializeObject, is:

                string responseFromServer = (SR_Response.ReadToEnd());
                String[] jsons = responseFromServer.Split('\n');
                String text = "";
                foreach (String j in jsons)
                {
                    dynamic jsonObject = JsonConvert.DeserializeObject(j);
                    if (jsonObject == null || jsonObject.result.Count <= 0)
                    {
                        continue;
                    }
                    Console.WriteLine((string)jsonObject["result"]["alternative"][0]["transcript"]);
                    text = jsonObject.result[0].alternative[0].transcript;
                }
                Console.WriteLine("MESSAGE : "+text); 

Solution

  • What you have is a series of JSON root objects concatenated together into a single stream. As explained in Read Multiple Fragments With JsonReader, such a stream can be deserialized by setting JsonReader.SupportMultipleContent = true. Thus, to deserialize your stream, you should first introduce the following static helper methods:

    using System.Collections.Generic;
    using System.IO;
    using Newtonsoft.Json;

    public static class JsonExtensions
    {
        public static IEnumerable<T> DeserializeObjects<T>(Stream stream, JsonSerializerSettings settings = null)
        {
            // Not disposed here: the caller should dispose the underlying stream.
            var reader = new StreamReader(stream);
            return DeserializeObjects<T>(reader, settings);
        }

        public static IEnumerable<T> DeserializeObjects<T>(TextReader textReader, JsonSerializerSettings settings = null)
        {
            var ser = JsonSerializer.CreateDefault(settings);
            // Not disposed here: the caller should dispose textReader.
            var reader = new JsonTextReader(textReader);

            // Allow more than one root JSON value in the same stream.
            reader.SupportMultipleContent = true;

            while (reader.Read())
            {
                // Skip tokens that do not start a value.
                if (reader.TokenType == JsonToken.None || reader.TokenType == JsonToken.Undefined || reader.TokenType == JsonToken.Comment)
                    continue;
                yield return ser.Deserialize<T>(reader);
            }
        }
    }
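
    To see the effect of SupportMultipleContent in isolation, here is a minimal sketch that reads a shortened copy of the response from a string instead of a web response. The class name MultipleContentDemo and the helper ReadAll are hypothetical names made up for this illustration, and the sample text is abbreviated from the question; it assumes the Newtonsoft.Json package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class MultipleContentDemo
{
    // Hypothetical helper for this sketch: loads every root JSON object found in the text.
    public static List<JObject> ReadAll(string text)
    {
        var objects = new List<JObject>();
        using (var reader = new JsonTextReader(new StringReader(text)))
        {
            // Without this flag, the reader throws when it meets content after the first root value.
            reader.SupportMultipleContent = true;

            while (reader.Read())
            {
                // Read() has positioned the reader at the start of the next root object.
                objects.Add(JObject.Load(reader));
            }
        }
        return objects;
    }

    public static void Main()
    {
        // Shortened version of the response in the question: two root objects, one per line.
        string text =
            "{\"result\":[]}\n" +
            "{\"result\":[{\"alternative\":[{\"transcript\":\"hello Google\",\"confidence\":0.97}],\"final\":true}],\"result_index\":0}";

        foreach (JObject obj in ReadAll(text))
        {
            // Prints the number of entries in each root object's "result" array.
            Console.WriteLine(((JArray)obj["result"]).Count);
        }
    }
}
```

    The first root object has an empty "result" array and the second has one entry, which is why a single call to JObject.Parse on the whole text cannot work.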
    

    Next, using a code-generation utility such as http://json2csharp.com/, generate C# classes for a single JSON root object, like so:

    public class Alternative
    {
        public string transcript { get; set; }
        public double confidence { get; set; }
    }
    
    public class Result
    {
        public List<Alternative> alternative { get; set; }
        public bool final { get; set; }
    }
    
    public class RootObject
    {
        public List<Result> result { get; set; }
        public int result_index { get; set; }
    }
    

    And deserialize as follows:

    List<RootObject> results;
    using (var stream = HWR_Response.GetResponseStream())
    {
        results = JsonExtensions.DeserializeObjects<RootObject>(stream).ToList();
    }
    

    Having done this, you can use standard C# techniques such as LINQ to enumerate the transcript values:

    var transcripts = results
        .SelectMany(r => r.result)
        .SelectMany(r => r.alternative)
        .Select(a => a.transcript)
        .ToList();
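
    If only a single transcript is wanted, as in the question's second attempt, the same query can stop at the first match. The sketch below is one possible way to do that; TranscriptPicker and FirstTranscript are hypothetical names, and the model classes are repeated only so that the block compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Alternative
{
    public string transcript { get; set; }
    public double confidence { get; set; }
}

public class Result
{
    public List<Alternative> alternative { get; set; }
    public bool final { get; set; }
}

public class RootObject
{
    public List<Result> result { get; set; }
    public int result_index { get; set; }
}

public static class TranscriptPicker
{
    // Hypothetical helper: returns the first transcript found across all root objects,
    // or null when every "result" array is empty.
    public static string FirstTranscript(IEnumerable<RootObject> roots)
    {
        return roots
            .Where(r => r.result != null)       // tolerate a missing "result" property
            .SelectMany(r => r.result)
            .Where(r => r.alternative != null)  // tolerate a missing "alternative" property
            .SelectMany(r => r.alternative)
            .Select(a => a.transcript)
            .FirstOrDefault();
    }
}
```

    Root objects with an empty "result" array, such as the first one in the question, contribute nothing to the query, so no explicit Count check is needed.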
    

    If you don't want to define a fixed data model for your JSON collection, you can deserialize directly to a list of JObject like so:

    List<JObject> objs;
    using (var stream = HWR_Response.GetResponseStream())
    {
        objs = JsonExtensions.DeserializeObjects<JObject>(stream).ToList();
    }
    

    Then you can use SelectTokens() to select the values of all the "transcript" properties nested inside each object:

    var transcripts = objs
        // The following uses the JSONPath recursive descent operator ".." to pick out all properties named "transcript".
        .SelectMany(o => o.SelectTokens("..transcript")) 
        .Select(t => t.ToString())
        .ToList();
    

    Updated sample fiddle showing both options.