Search code examples
if-statementpowerbiincrementpowerquerym

Power Query: how to add one to a column when a specific values appear in an other column


I have an ID column and I am looking for ways to increment my IDs each time a specific item appears in my Geography column (ItalyZ, ItalyM, UKY or UKM) is found.

The ID of ItalyZ starts at 0 and ends at 4000.

The ID of ItalyB starts at 4000 and ends at 8000.

The ID of UKY starts at 0 and ends at 4000.

The ID of UKM starts at 4000 and ends at 8000.

However, I am refreshing my file, and I will thus have from time to time new arrivals of "geographies" without the origins or first IDs. These boundaries/ranges are only known beginning and ends.

Here is a sample of my data:

  |---------------------|------------------|    
  |       ID            |   Geography      |
  |---------------------|------------------|
  |    AB0000           |      ItalyZ      |
  |---------------------|------------------|
  |    AB4041           |      ItalyB      |
  |---------------------|------------------|
  |    BC0000           |      UKY         |
  |---------------------|------------------|
  |    BC4001           |      UKM         |
  |---------------------|------------------|
  |    NULL             |      ItalyZ      |
  |---------------------|------------------|
  |    NULL             |      ItalyZ      |
  |---------------------|------------------|
  |    NULL             |      UKY         |
  |---------------------|------------------|
  |    NULL             |      UKM         |
  |---------------------|------------------|  

Here is my expected output :

  |---------------------|------------------|    
  |       ID            |   Geography      |
  |---------------------|------------------|
  |    AB0000           |      ItalyZ      |
  |---------------------|------------------|
  |    AB4041           |      ItalyB      |
  |---------------------|------------------|
  |    BC0000           |      UKY         |
  |---------------------|------------------|
  |    BC4001           |      UKM         |
  |---------------------|------------------|
  |    AB0001           |      ItalyZ      |
  |---------------------|------------------|
  |    AB0001           |      ItalyZ      |
  |---------------------|------------------|
  |    AB4042           |      UKY         |
  |---------------------|------------------|
  |    BC0001           |      UKM         |
  |---------------------|------------------|  

I have been trying many various ways and trying to adapt running total solutions. I have also been trying to break apart my file in four different ones in order not to have an If function alternating between cases, and thus making it simpler, like this in my power query:

 #"Added Custom2" = Table.AddColumn(#"Reordered Columns", "Sum", each if [Geography] = "UKM" then [Number AB range below 4000] + 1 
else if [Geography] = "UKY" then [Number AB range above 4000] + 1 
else if [Geography] = "ItalyB" then [Number BC range above 5000]
else [Number BC range below 5000] + 1)

But absolutely nothing works. This maddening.


Solution

  • As in my other answer, here's a simplified problem ignoring the ID letter prefixes you have.

    ID,  Group | NewID
    -----------|------
    4,     A   | 4
    7,     A   | 7
    300,   B   | 300
    525,   C   | 525
    null,  A   | 10
    9,     A   | 9
    null,  A   | 11
    null,  B   | 301
    null,  C   | 526
    null,  A   | 12
    null,  B   | 302
    

    Starting from the left part of the table, we want to compute the new column NewID.

    In this answer, I will write a custom function that's written recursively using the List.Generate function.

    From the linked documentation the function is set up like this

    List.Generate(
        initial as function,                    /*Set all your initial variables*/
        condition as function,                  /*Stopping criteria.*/
        next as function,                       /*Define how to update at each step.*/
        optional selector as nullable function  /*Pick output element.*/
    ) as list
    

    Define a function that takes a column potentially containing nulls and fills the nulls incrementally from the maximal non-null value:

    (Column as list) as list =>
    let
        Generate =
        List.Generate(
            () => [x = Column{0}, i = 0, n = List.Max(Column)],
            each [i] < List.Count(Column),
            each [
                i = [i] + 1,
                x = if Column{i} = null then [n] + 1 else Column{i},
                n = if Column{i} = null then [n] + 1 else [n]
                ],     
            each [x]
        )
    in
        Generate
    

    When you define the function, it looks like this and can be re-used in any other queries:

    Screenshot

    You can use it by choosing an existing column in an existing table and clicking the Invoke button.

    Select Column

    This will create a new list in your Queries pane named Invoked Function that is that function applied to the column you selected.

    Invoked Function

    You can also create a blank query and pass it a list. For example, FilterNulls({4,7,null,9,null,null}) returns {4,7,10,9,11,12}.

    Here's what that looks like in the query editor.

    New Query


    What we really want to do is use this function as a column transformation in a group by operation and then expand:

    let
        Source = <Data Table Source Here>,
        #"Grouped Rows" = Table.Group(Source, {"Group"}, {{"FillNulls", each FillNulls([ID]), type list}}),
        #"Expanded FillNulls" = Table.ExpandListColumn(#"Grouped Rows", "FillNulls")
    in
        #"Expanded FillNulls"
    

    Here's what it looks like after grouping but before expanding:

    Group By

    Notice what the function is doing. We're applying the function FillNulls on the ID column for each separate Group.


    This is a similar amount of steps and complexity as the other answer but uses a function that is constructed in a recursive way that may be more familiar to you.