Search code examples
excelpowerbipowerquerym

Finding the nearest value in a set of a column


I need to retrieve the nearest matching value in a fixed set values for an entire column.

I can NOT use VBA, and ideally script this in M or a calculated column in powerquery so the result can be replicated in Power BI.

The set of values that have to be matched looks as follows

| preceding column | Sys_size | 
===============================
| ...              |     null |
| ...              |        7 |
| ...              |        9 |
| ...              |       12 |
| ...              |       15 |
| ...              |       17 |
| ...              |     null |

so in short, the list above is variable (more sizes could be added or changed), and contains null (empty) values.

Second there's a bunch of variable numbers as follows

| preceding column | User_size |
================================
| ...              |       8.5 |
| ...              |        13 |
| ...              |         6 |
| ...              |      10.5 |
| ...              |        18 |
| ...              |        14 |

The result I want to obtain in my script looks like this

| preceding column | User_size | Sys_size |
===========================================
| ...              |       8.5 |        9 |
| ...              |        13 |       12 |
| ...              |         6 |        7 |
| ...              |      10.5 |       12 |
| ...              |        18 |       17 |
| ...              |        14 |       15 |

simply put, it searches the nearest Sys_size matching the User_size input. Note that in case the user's value falls exactly between two Sys_size values the result is rounded up.


Solution

  • Here's one way to do it in the query editor:

    First, add a custom column that calculates the minimal distance between User_size and Sys_size for each row of the Users table:

    = Table.AddColumn(PreviousStepNameHere, "Dist",
      (U) => List.Min(List.Transform(List.RemoveNulls(System[Sys_size]),
                                     each Number.Abs(_ - U[User_size]))))
    

    Once you have that, you can pull in Sys_size with a custom column with this formula:

    if List.Contains(System[Sys_size], [User_size] + [Dist])
        then [User_size] + [Dist]
    else if List.Contains(System[Sys_size], [User_size] - [Dist])
        then [User_size] - [Dist]
    else null
    

    (You can probably drop the else if and just use else [User_size] - [Dist] instead.)