c#, json.net, jsonpath, jsonreader, jpath

JsonPath with JsonTextReader: Token at a Time


I am having an issue with JsonPath behaving differently when I load one token at a time (.Load) using JsonTextReader versus loading the entire JSON using .ReadFrom. Here is an example, with Path="[*].person" and Method=SelectTokens(path). JSON:

[
  {
    "person": {
      "personid": 123456
    }
  },
  {
    "person": {
      "personid": 798
    }
  }
]

When using .ReadFrom, it'll return the proper 2 elements. If I use .Load though, it'll return 0 elements. However, if I change the path to "person", .ReadFrom returns 0 elements while .Load returns 2 elements.

As a workaround, I could strip everything up to and including the first "." from the path, i.e. path = path.Substring(path.IndexOf(".") + 1); however, this feels more like a hack than a proper fix. I would, of course, also need to ensure that the JSON is an array, but in most of my cases it would be.

So finally, I am trying to learn how to use JSON Path with arrays when loading a token at a time. Any recommendations?

Full Code

Full JSON


Solution

  • What is happening in the code you linked to is that it reads tokens until it encounters an object, then loads a JToken from that object, which reads ahead to the end of the object. So you end up with one JToken per item in the root array. For each JToken you can then call:

    token.SelectTokens("person").OfType<JObject>()
    

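    because you know the property contains an object.

    That is equivalent to running the "[*].person" JsonPath against the whole parsed JSON: each loaded JToken is already one element of the root array, so the "[*]" step has been applied by the reader and only "person" remains.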
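
    The per-item approach described above could be sketched as follows. This is only an illustration under assumptions: the class name PerItemSelect, the method name SelectPerItem, and the relativePath parameter are hypothetical and not taken from the linked code. It assumes the root is an array of objects and that the path is written relative to each element ("person" rather than "[*].person").

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public static class PerItemSelect
{
    // Hypothetical helper: loads one top-level object at a time and applies a
    // path that is relative to that object (e.g. "person", not "[*].person").
    public static IEnumerable<JToken> SelectPerItem(TextReader tr, string relativePath)
    {
        using (var reader = new JsonTextReader(tr))
        {
            while (reader.Read())
            {
                // Ignore the root StartArray/EndArray tokens.
                if (reader.TokenType != JsonToken.StartObject)
                    continue;

                // Load reads ahead to the matching EndObject, so the next
                // Read() moves on to the following array element.
                JToken item = JToken.Load(reader);

                foreach (JToken match in item.SelectTokens(relativePath))
                    yield return match;
            }
        }
    }
}
```

    Called as PerItemSelect.SelectPerItem(new StringReader(json), "person") on the sample JSON from the question, this should yield the two person objects, while only one array element is held in memory at a time.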

    I hope I have understood your question correctly. If not, please let me know =)

    Update:

    Based on your comments I understand what you are after. What you could do is create a method like this:

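    // Requires: System.Collections.Generic, System.IO, System.Text.RegularExpressions,
    // Newtonsoft.Json, Newtonsoft.Json.Linq
    public IEnumerable<JToken> GetTokensByPath(TextReader tr, string path)
    {
        // Do our best to convert the path to a regex: escape it so "." is literal,
        // let "[*]" match any array index, and anchor it so that deeper paths
        // such as "[0].person.personid" do not match as well.
        // (Regex.Escape leaves "]" unescaped, hence the @"\[\*]" search string.)
        var regex = new Regex("^" + Regex.Escape(path).Replace(@"\[\*]", @"\[[0-9]+\]") + "$");
        using (var reader = new JsonTextReader(tr))
        {
            while (reader.Read())
            {
                // Skip property names so that the value is loaded, not a JProperty.
                if (reader.TokenType != JsonToken.PropertyName && regex.IsMatch(reader.Path))
                    yield return JToken.Load(reader);
            }
        }
    }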
    

    I am matching the reader's current path against the JSON path input, but a complete solution would need to handle all of the various JSON path grammars; at the moment I only support [*]. This approach is useful when you have a massive file with a deep JSON path selector: you'll keep the stream open longer if you enumerate slowly, but you will have much lower peak memory usage.
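
    To make the path-to-regex step a little more concrete, here is a minimal sketch of it in isolation. The names PathPattern and ToRegex are hypothetical and not part of Json.NET. It assumes the only wildcard is "[*]", strips an optional leading "$" root (since JsonReader.Path does not include one), and does not handle filters, slices, recursive descent, or property names that need bracket quoting.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PathPattern
{
    // Hypothetical helper: converts a simple JSONPath into an anchored regex
    // that can be compared against JsonReader.Path (e.g. "[0].person").
    public static Regex ToRegex(string jsonPath)
    {
        // JsonReader.Path has no "$" root prefix, so drop it if present.
        if (jsonPath.StartsWith("$.", StringComparison.Ordinal))
            jsonPath = jsonPath.Substring(2);
        else if (jsonPath.StartsWith("$", StringComparison.Ordinal))
            jsonPath = jsonPath.Substring(1);

        // Escape regex metacharacters first so that "." matches a literal dot.
        string escaped = Regex.Escape(jsonPath);

        // Regex.Escape turns "[*]" into "\[\*]" (the "]" is left unescaped);
        // replace that with a pattern matching any numeric array index.
        string pattern = escaped.Replace(@"\[\*]", @"\[[0-9]+\]");

        // Anchor both ends so that only the exact path matches.
        return new Regex("^" + pattern + "$");
    }
}
```

    With this, PathPattern.ToRegex("[*].person") should match "[0].person" and "[12].person" but not "[0].person.personid", which keeps nested tokens from being picked up by accident.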

    I hope this helps.