Search code examples
kdbq-lang

Identify N maxes of a row, discarding the remaining bottom values from table rows


I've got stuck on the following problem and was hoping for some help. I've tried a few things and have used some information found on Stack Overflow (such as this/How to apply max function for each row in KDB? and this/Iterate over current row values in kdb query, and flipping then sliding windows as per my previous post), and flicked through my Q for Mortals again but for some reason I have hit a brick wall.

In my table, the first column is a date column, the rest are numbers. From this I'm trying to generate a table that only has the n maximum numbers of a row left, the rest set to zero or 0N (or, if you like, where the m bottom values have been discarded).

Example:

Starting table:

q)t:([] td:2001.01.01 2001.01.02 2001.01.03 2001.01.04 2001.01.05 2001.01.06; 
AA:121.5 125.0 127.0 126.0 129.2 130.0; BB:111.0 115.3 117.0 116.0 119.2
120.0; CC:120.0 126.0 125.5 128.8 135.0 130.0; DD:120.1 123.3 128.4 128.3 
127.5 126.0; NN:122.0 125.5 126.0 116.0 109.0 100.5)

td         AA    BB    CC    DD    NN   
----------------------------------------
2001.01.01 121.5 111   120   120.1 122  
2001.01.02 125   115.3 126   123.3 125.5
2001.01.03 127   117   125.5 128.4 126  
2001.01.04 126   116   128.8 128.3 116  
2001.01.05 129.2 119.2 135   127.5 109  
2001.01.06 130   120   130   126   100.5

The desired end result when identifying the 2 maximums per row and blanking the rest (with either 0 or 0n):

td         AA    BB CC    DD    NN   
-------------------------------------
2001.01.01 121.5                122  
2001.01.02          126         125.5
2001.01.03 127            128.4      
2001.01.04          128.8 128.3      
2001.01.05 129.2    135              
2001.01.06 130      130              

To take row 1 as an example, the top 2 values in AA and NN of that row have been left whilst the two others in BB and CC have been blanked out.

To get only the max value, i.e. the one top value, I can do the below and use the newly added column in a followup update statement. However, the problem here is that I need to find the n maxes and discard the rest.

q)update maxes:max(AA;BB;CC;DD;NN) from t

Not sure if it's of any interest, but as an example of what I have tried: If I use a tip from another stack overflow post and I execute that on the values themselves I can sort of get there, but not in a table format:

q)nthMax:{x (idesc x)[y-1]}
{x (idesc x)[y-1]}
q)nthMax[(121.5 111 120 120.1   122);1]
122f
q)nthMax[(121.5 111 120 120.1   122);2]
121.5

However when I try to use this as part of an update or select then it's not working; also it strikes me as a non-q approach so interested in what folks have to say about solving the above.

Another example was I tried flipping the table then use MMAX, however as the dates are at the top they "survive". Also, this seems a bit clunky maybe as I have to do this for n times per column if I'm interested in n maxes, or drop x numbers that form the bottom values, leaving n max numbers.

Kind regards, Sven


Solution

  • Here's another option, possibly a little tidier:

    q)0!{key[x]#(2#idesc x)#x}'[1!t]
    td         AA    BB CC    DD    NN
    -------------------------------------
    2001.01.01 121.5                122
    2001.01.02          126         125.5
    2001.01.03 127            128.4
    2001.01.04          128.8 128.3
    2001.01.05 129.2    135
    2001.01.06 130      130
    

    This works on the assumption that the first column is the only one you don't want to consider for the maximums. It's similar to the other two answers in it's use of idesc. One part to note here is key[x]# which essentially adds null entries to a dictionary to ensure all keys are present. As an example of this:

    q)`a`b`c#`a`c!1 2
    a| 1
    b|
    c| 2
    

    Note how b is in the resultant dictionary but not in the original dictionary. This is used to make sure the dictionary generated for each line conforms with the others, thus resulting in a table (which after all, is just a list of conforming dictionaries).