
TokensRegex JSON response


The TokensRegex response (web API) looks like the following. Instead of a JSON array, each sentence is an object whose matches are keyed by numeric index ("0", "1", ...), plus a "length" entry. Is there a way to change the format, or is there a reason it must be that way? As it stands, it is hard to deserialize the result or write a query against it.

{
  "sentences": [
    {
      "0": {
        "text": "huge success",
        "begin": 4,
        "end": 6
      },
      "1": {
        "text": "new venture",
        "begin": 17,
        "end": 19
      },
      "2": {
        "text": "comfort zone",
        "begin": 26,
        "end": 28
      },
      "length": 3
    }
  ]
}

Solution

  • You can use Json.NET's LINQ-to-JSON API to deserialize this JSON into something sensible.

    First, define a class Phrase like this:

    class Phrase
    {
        public string Text { get; set; }
        public int Begin { get; set; }
        public int End { get; set; }
    }
    

    Then you can do this to get a list of phrases:

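    // Requires: using System.Collections.Generic; using System.Linq; using Newtonsoft.Json.Linq;
    JObject obj = JObject.Parse(json);
    
    // Each match is a property ("0", "1", ...) whose value is a JSON object.
    // The Where clause skips the integer-valued "length" property.
    // Note: only the first sentence (index 0) is read here.
    List<Phrase> phrases = 
        obj["sentences"][0]
            .Children<JProperty>()
            .Where(jp => jp.Value.Type == JTokenType.Object)
            .Select(jp => jp.Value.ToObject<Phrase>())
            .ToList();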
    

    Fiddle: https://dotnetfiddle.net/hU4iTp
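
The snippet above reads only the first entry of "sentences" (`obj["sentences"][0]`). If the response can contain several sentences, the same idea can probably be extended with `SelectMany`. The following is a minimal sketch, not a definitive implementation: `PhraseExtractor` and `ExtractAll` are hypothetical names made up for illustration, and it assumes "sentences" is always present and always holds objects shaped like the one in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Newtonsoft.Json.Linq;

public class Phrase
{
    public string Text { get; set; }
    public int Begin { get; set; }
    public int End { get; set; }
}

// Hypothetical helper for illustration; not part of Json.NET or CoreNLP.
public static class PhraseExtractor
{
    public static List<Phrase> ExtractAll(string json)
    {
        JObject obj = JObject.Parse(json);

        // Assumption: "sentences" exists and is an array of objects.
        return obj["sentences"]
            .Children<JObject>()                              // one JObject per sentence
            .SelectMany(sentence => sentence.Properties())    // "0", "1", ..., "length"
            .Where(jp => jp.Value.Type == JTokenType.Object)  // skip the integer "length"
            .Select(jp => jp.Value.ToObject<Phrase>())
            .ToList();
    }
}
```

It could then be used like this: `foreach (Phrase p in PhraseExtractor.ExtractAll(json)) Console.WriteLine($"{p.Text} ({p.Begin}-{p.End})");`. The result is a single flat list, so the information about which sentence a phrase came from is lost. If that matters, grouping per sentence instead of flattening would be the alternative.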