I'm new in KDB and struggling with creating a query. Will appreciate any help.
I have a table of strings and need to get the count of all specific substrings across all strings in the table.
So, let's assume that I have strings:
[
string1: Apple is green, cherry is red,
string2: Ququmber is green, banana is yellow
]
and I want to get a count of "Apple" and "green" across all substrings. My desired result is to have a grouping like so:
{
Apple: 1,
green: 2
}
But, unfortunately, I have no idea how to make such a grouping. I have already figured out how to get strings that contain at least one of the needed substrings:
"select count(text) from data where any text like/: (\"*$Apple*\";\"*$green*\")"
but that returns me the cumulative result of all found strings for Apple and green without any grouping:
{
text: 3
}
which does not allow differentiating the amount of each particular substring.
I will be really thankful for any help.
Instead of using a where clause with an any
, you can put the like/:
in the select phrase to get a nested list of booleans where each list represents the matches for one search string. Then you can just sum
these to get the total matches for each search string. I've used an exec
here rather than a select
as I suspect that output will be more useful:
q)t:([] text:("Apple is green, cherry is red,";"Ququmber is green, banana is yellow"))
q)exec sum each text like/:("*Apple*";"*green*") from t
1 2i