For the following data, I can create a Kibana pie chart of terms easily enough, but it only counts the keywords, so 'tests failed' shows as 50%, and 'host not found' and 'compile failed' as 25% each.
{"id":12345, "issues":["tests failed", "host not found"]}
{"id":12645, "issues":["tests failed"]}
{"id":12643, "issues":["compile failed"]}
What I'm after is a representation that preserves the record association - for example, a doughnut chart with an inner ring showing 'compile failed' as 1/3 and 'tests failed' as 2/3, with a 1/3 wedge along its outside for the record that has 'host not found' too. Or a tree-ish Sankey sort of diagram, or stacked histogram, or whatever.
I've looked through the Kibana and Vega documentation, but it seems I'd need to build a tree from this data, which implies looping: find the most common term, split the records into 'have' and 'have-not' groups, and recurse into each group for its next most common term until all the groups are segregated. But Vega disclaims any support for loops/recursion, and in Kibana I can't find anything useful.
The Vega tree-related transforms don't seem applicable: nest creates a pre-defined hierarchy order based on specific field names, and stratify needs parent/child fields, which I could probably generate, but without adaptive 'most common term' handling at each node the results would be messy.
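To make the splitting procedure concrete, here is a minimal sketch of it in plain Python, outside Vega. The function name `build_tree` and the node layout (`term`, `count`, `children`) are made up for illustration, and ties between equally common terms are broken alphabetically, which is an assumption rather than part of the original idea.

```python
from collections import Counter

records = [
    {"id": 12345, "issues": ["tests failed", "host not found"]},
    {"id": 12645, "issues": ["tests failed"]},
    {"id": 12643, "issues": ["compile failed"]},
]

def build_tree(records):
    """Split records on their most common term, then recurse into both groups."""
    counts = Counter(term for r in records for term in set(r["issues"]))
    if not counts:
        return []  # no terms left in this group, nothing more to split
    # Most common term first; alphabetical tie-break (an assumption) keeps it deterministic
    term = min(counts, key=lambda t: (-counts[t], t))
    have = [r for r in records if term in r["issues"]]
    have_not = [r for r in records if term not in r["issues"]]
    # Drop the chosen term from the 'have' group before descending a level
    remaining = [
        {"id": r["id"], "issues": [t for t in r["issues"] if t != term]}
        for r in have
    ]
    node = {"term": term, "count": len(have), "children": build_tree(remaining)}
    # Sibling nodes come from the records that lack the chosen term
    return [node] + build_tree(have_not)

tree = build_tree(records)
```

For the three sample records this gives two top-level nodes, 'tests failed' (2 records) and 'compile failed' (1 record), with 'host not found' (1 record) as a child of 'tests failed', which matches the doughnut layout described above.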
I started to look at custom transforms, but before I dive that deep, I thought it would be sensible to ask whether there's already something out there able to do something like this.
Update: since this was my first foray into Vega and I only had time because I was waiting for a code review, my solution is undoubtedly horrible, but I'll link to a Gist and add a few comments about it in case someone finds it helpful. When I get back to its ticket, I'll make it easier to deepen (i.e. adding an extra layer should be mostly copy/paste), retarget it to an existing tree visualisation, and comment it properly.
I will say, writing in Vega made me feel like my brain was being twisted, much like my University Prolog and LISP courses did! :) I think a proper DSL would be a good idea to make it more readable, perhaps one which can add looping, inter-transform 'viewpoints', and some sort of type system!
Overlapping set values are probably best shown with an UpSet plot. There are a few Vega ones you can reuse.
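As a rough illustration of what an UpSet plot needs from this data, the sketch below computes the two tallies such a plot typically shows: how many records carry each exact combination of issues, and how many records carry each issue at all. It is plain Python rather than Vega, and the variable names `combo_counts` and `set_sizes` are made up for the example.

```python
from collections import Counter

records = [
    {"id": 12345, "issues": ["tests failed", "host not found"]},
    {"id": 12645, "issues": ["tests failed"]},
    {"id": 12643, "issues": ["compile failed"]},
]

# One bar per exact combination of issues (the intersection bars)
combo_counts = Counter(frozenset(r["issues"]) for r in records)

# One bar per individual issue (the per-set totals)
set_sizes = Counter(term for r in records for term in set(r["issues"]))

# Print combinations, largest first
for combo, n in combo_counts.most_common():
    print(sorted(combo), n)
```

Because each record lands in exactly one combination, the combination counts sum to the number of records, so the record association the question asks about is preserved.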
Regarding looping, you can sometimes get round it by using a Cartesian join if there are not many elements. I used such a technique here: https://github.com/PBI-David/Deneb-Showcase?tab=readme-ov-file#particle-simulation-background
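The general idea of trading a loop for a Cartesian join can be sketched like this: pair every record with every distinct term up front, then derive a membership flag per pair, so later steps only filter and aggregate a flat table. This is a plain Python illustration with made-up names (`terms`, `rows`, `has_term`), not a reproduction of the linked Deneb example, and the table grows as records × terms, so it only suits small inputs.

```python
from itertools import product

records = [
    {"id": 12345, "issues": ["tests failed", "host not found"]},
    {"id": 12645, "issues": ["tests failed"]},
    {"id": 12643, "issues": ["compile failed"]},
]

# Distinct terms across all records, sorted for a stable order
terms = sorted({term for r in records for term in r["issues"]})

# Cartesian join: one row per (record, term) pair, with a membership flag
rows = [
    {"id": r["id"], "term": t, "has_term": t in r["issues"]}
    for r, t in product(records, terms)
]

# Aggregating the flat table gives per-term counts without walking record by record
per_term = {t: sum(row["has_term"] for row in rows if row["term"] == t) for t in terms}
```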