Search code examples
powerbipowerquerym

Power Query join on the least difference


How to make a Power Query join of two tables on least difference between columns. I mean absolute difference between numbers.

enter image description here

I followed this great article: https://exceed.hr/blog/merging-with-date-range-using-power-query/

I tried adding this custom column analogously, where L stands for Tab1 and R for Tab2:

= Table.AddColumn(
    Source, 
    "LeastAbsDifference", 
    (L) =>
      Table.SelectRows( Tab2, 
         (R) => L[category] = R[category] and Number.Abs(L[target] - R[actual]) )
  )

It produces error: Expression.Error: We cannot convert the value 4 to type Logical.


Tables to recreate example:

// Tab1
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJVitWJVnKCs5whrFgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, target = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"target", Int64.Type}})
in
    #"Changed Type"
// Tab2
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTJUitWBsIzgLGMwywnIMoGzTOEsM6XYWAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [category = _t, actual = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"actual", Int64.Type}})
in
    #"Changed Type"

Solution

  • I would like to acknowledge a favor of Kristian Rados, who provided the answer to my question in the comments of his article: Merging with date range using Power Query With my gratitude, and by courtesy of the author, I am quoting the answer in full:


    The reason your formula produces an error is the second argument of the Table.SelectRows function. In it, you need to filter the table with the boolean (true/false) expression. In your case, the part of the code with Number.Abs function returns a number instead of true/false (e.g. L[target] – R[actual] = 5-1=4 ). Trying to filter the table this way could be possible, but it would require you to use multiple nested environments, which would result in a very complicated formula and slow performance.

    I would suggest trying a different approach. By using your example from stack overflow I reproduced the problem. Below is a complete M code I came up with along with the explanation below:

    let
        Source = Tab1,
        #"Merged Queries" = Table.NestedJoin(Source, {"category"}, Tab2, {"category"}, "Tab2", JoinKind.LeftOuter),
        #"Expanded Tab2" = Table.ExpandTableColumn(#"Merged Queries", "Tab2", {"actual"}, {"actual"}),
        #"Inserted Subtraction" = Table.AddColumn(#"Expanded Tab2", "Least difference", each Number.Abs([target] - [actual]), Int64.Type),
        #"Grouped Rows" = Table.Group(#"Inserted Subtraction", {"category"}, {{"All", each Table.First(Table.Sort(_, {{"Least difference", Order.Ascending}}))}}),
        #"Expanded All" = Table.ExpandRecordColumn(#"Grouped Rows", "All", {"target", "actual", "Least difference"}, {"target", "actual", "Least difference"})
    in
        #"Expanded All"
    

    First, we merged the queries by using the category column. After expanding the table, we subtracted two columns to get the absolute difference between target and actual. Finally, we group by category and sort the table by the Least difference column in ascending order (Table.Sort function inside the grouped rows). After this, we take the first row of the nested table (Table.First function), and finally expand the record column.