Search code examples
joinleft-joinkdbq-lang

How do I perform an efficient left join in kdb?


I want to run a conventional SQL-style left join in KDB+ / Q.

  • We get at least one row in the result for every row in the left table.
  • If there is more than one match in the right table, I get a row for each of those, not just for the first match

Test Data

  x:([];a:1 1 2 3; b:3 4 5 6)

  y:([]; a:1 2 2 4; c:7 8 9 10)

The best version I can think of goes like this:

This appends a left join which provides only the first match to an inner join which provides all matches and then removes the duplicates:

  distinct ej[`a; x; y] , x lj `a xkey y

Can anyone supply me with one that is faster and/or better in some other way? I'd really like to avoid the use of distinct, for example.


Solution

  • Building on @Ryan's answer.

    k)nungroup:{$[#x:0!x;(,/){$[#t:+:x;t;enlist *:'[x];t]}'[x]]}  
    q)nungroup:{$[count x:0!x;(,/){$[count t:flip x;t;enlist first'[x]]}'[x];x]}  
    
    q)nungroup x lj ` \`a xgroup y
    
    a b c 
    ----- 
    1 3 7 
    1 4 7 
    2 5 8 
    2 5 9 
    3 6