
Convert JArray to list, keeping only one nested JToken


I have a JArray that looks like this:

[
    {
        "1": "A",
        "2": "B",
        "3": {
            "5": "D",
            "KeyICareAbout": "C"
        }
    },
    {
        "1": "E",
        "2": "F",
        "3": {
            "5": "H",
            "KeyICareAbout": "G"
        }
    }
]

I only care about the value of KeyICareAbout. How can I remove all other tokens and convert the array into a list? The array might be big, so I don't think looping would be efficient. I'm currently doing it this way, but it feels convoluted and brittle, and I suspect I'm overlooking an easier solution:

// Flattens the JArray by keeping only the values and converts them to a lookup, using the last segment of each path as the key.
var lookup = array.Descendants().OfType<JValue>().ToLookup(jv => jv.Path.Substring(jv.Path.LastIndexOf(".") + 1), jv => jv.Value);

// Converts the lookup entry for the key to a list.
var list = lookup["KeyICareAbout"].ToList();

Solution

  • You can improve the efficiency of your code as follows:

    var key = "KeyICareAbout";
    var list = array.Descendants()
        .OfType<JProperty>() // Look for all properties
        .Where(p => p.Name == key) // With the name KeyICareAbout
        .Select(p => p.Value) // And select their value.
        .ToList();
    

    Notes:

    • ToLookup() builds a hashed lookup table for all keys encountered in the JSON, which costs time and memory roughly proportional to n * k, with n the number of unique keys and k the average number of values per key. You only care about the value of one specific key, so you'll get better performance by filtering out irrelevant keys with a Where() expression and skipping creation of the lookup table entirely.

      If you needed your result to contain the values of many keys, a lookup table would make sense.

    • Rather than parsing the Path string, you can just get the property name from the immediate parent JProperty of the values you want.
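
To illustrate the case where a lookup table does pay off, here is a minimal sketch that groups values by the name of their parent JProperty instead of parsing Path strings. The second key, "5", is only an example taken from the sample JSON in the question, and the cast to string assumes the values of the queried keys are strings.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

// Sample data from the question.
var array = JArray.Parse(@"[
    { ""1"": ""A"", ""2"": ""B"", ""3"": { ""5"": ""D"", ""KeyICareAbout"": ""C"" } },
    { ""1"": ""E"", ""2"": ""F"", ""3"": { ""5"": ""H"", ""KeyICareAbout"": ""G"" } }
]");

// Group every property value by its property name in one pass over the tree.
var lookup = array.Descendants()
    .OfType<JProperty>()
    .ToLookup(p => p.Name, p => p.Value);

// Each key can now be queried without walking the JSON again.
var wanted = lookup["KeyICareAbout"].Select(v => (string)v).ToList();
var fives = lookup["5"].Select(v => (string)v).ToList();

Console.WriteLine(string.Join(", ", wanted)); // C, G
Console.WriteLine(string.Join(", ", fives));  // D, H
```

For a single key this does more work than the Where() filter, because it stores the values of every property in the document.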

    Demo fiddle #1 here.
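
If brevity matters more than raw speed, a JSONPath query through SelectTokens() is another option. This is a different technique from the LINQ query above, not a faster one; it is sketched here on the sample data from the question, and the cast to string assumes every matched value is a string.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

// Sample data from the question.
var array = JArray.Parse(@"[
    { ""1"": ""A"", ""2"": ""B"", ""3"": { ""5"": ""D"", ""KeyICareAbout"": ""C"" } },
    { ""1"": ""E"", ""2"": ""F"", ""3"": { ""5"": ""H"", ""KeyICareAbout"": ""G"" } }
]");

// "$..KeyICareAbout" uses recursive descent: match the named property at any depth.
var list = array.SelectTokens("$..KeyICareAbout")
    .Select(t => (string)t)
    .ToList();

Console.WriteLine(string.Join(", ", list)); // C, G
```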

    In your question you state that "the array might be big". If the JSON is extremely large (many MB in size), you might want to adopt a streaming solution and only load the values of "KeyICareAbout" into memory. The following does that:

    var key = "KeyICareAbout";
    
    using var stream = File.OpenRead(fileName);
    using var textReader = new StreamReader(stream, Encoding.UTF8);
    using var reader = new JsonTextReader(textReader);
    
    var list = new List<JToken>();
    while (reader.Read())
    {
        if (reader.TokenType == JsonToken.PropertyName && (string)reader.Value == key)
        {
            list.Add(JToken.Load(reader.ReadToContentAndAssert()));
        }
    }
    

    This uses the following extension methods:

    public static partial class JsonExtensions
    {
        public static JsonReader AssertTokenType(this JsonReader reader, JsonToken tokenType) => 
            reader.TokenType == tokenType ? reader : throw new JsonSerializationException(string.Format("Unexpected token {0}, expected {1}", reader.TokenType, tokenType));
        
        public static JsonReader ReadToContentAndAssert(this JsonReader reader) =>
            reader.ReadAndAssert().MoveToContentAndAssert();
    
        public static JsonReader MoveToContentAndAssert(this JsonReader reader)
        {
            ArgumentNullException.ThrowIfNull(reader);
            if (reader.TokenType == JsonToken.None)       // Skip past beginning of stream.
                reader.ReadAndAssert();
            while (reader.TokenType == JsonToken.Comment) // Skip past comments.
                reader.ReadAndAssert();
            return reader;
        }
    
        public static JsonReader ReadAndAssert(this JsonReader reader)
        {
            ArgumentNullException.ThrowIfNull(reader);
            if (!reader.Read())
                throw new JsonReaderException("Unexpected end of JSON stream.");
            return reader;
        }
    }
    

    Demo fiddle #2 here.
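
If you would rather not materialize the whole list, the same streaming loop can be wrapped in an iterator so that values are produced one at a time. The sketch below is a variation on the code above, not a replacement for it: ReadValues is a hypothetical helper name, the extension methods are inlined as a small loop so the block is self-contained, and a StringReader stands in for the file stream.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

// Small in-memory sample standing in for a large file.
var json = @"[
    { ""1"": ""A"", ""3"": { ""5"": ""D"", ""KeyICareAbout"": ""C"" } },
    { ""1"": ""E"", ""3"": { ""5"": ""H"", ""KeyICareAbout"": ""G"" } }
]";

var values = ReadValues(new StringReader(json), "KeyICareAbout")
    .Select(v => (string)v)
    .ToList();
Console.WriteLine(string.Join(", ", values)); // C, G

// Hypothetical helper: lazily yields the value of every property named `key`.
static IEnumerable<JToken> ReadValues(TextReader textReader, string key)
{
    using var reader = new JsonTextReader(textReader);
    while (reader.Read())
    {
        if (reader.TokenType == JsonToken.PropertyName && (string)reader.Value == key)
        {
            // Advance from the property name to its value, skipping any comments.
            do
            {
                if (!reader.Read())
                    throw new JsonReaderException("Unexpected end of JSON stream.");
            }
            while (reader.TokenType == JsonToken.Comment);

            // Load only this value into memory; the rest of the document is never materialized.
            yield return JToken.Load(reader);
        }
    }
}
```

Because the reader is disposed when enumeration finishes, the caller should consume the sequence (for example with foreach or ToList()) before disposing the underlying stream.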