Search code examples
powerbipowerquerypowerbi-desktopm

Power Query - Remove Consecutive Duplicates Only


I have a dataset (Airport Location Codes) e.g. London (LHR), Paris (CDG), Rome (FCO) etc on each row.

On some rows the entry might be London, London, London, Paris. which I planned to remove by clearing duplicates however on some instances I have genuine duplicates e.g. London, Paris London. Where I want to retain the final both London's because there was another location in between. If we consider Airport codes, the route might look like this -

LHR, LHR, CDG, FCO, FCO (Change to LHR-CDG-FCO) -

LHR, LHR, CDG, CDG, CDG, LHR (Change to LHR-CDG-LHR) What I was able to Achieve is Distinct List where it just returned LHR-CDG.

The number of times a code could appear is dynamic.

Is there function available in PowerQuery that could return this type of logic?

Sample data:

ID Route Required Result
1 LHR, LHR, CDG,CDG, FCO LHR-CDG-FCO
2 AMS, AMS, LHR, CDG AMS-LHR-CDG
3 AMS, AMS, LHR, AMS, AMS AMS-LHR-AMS
4 CGN-CGN-AMS-AMS-AMS-CDG-CDG-AMS CGN-AMS-CDG-AMS
5 CGN-CGN-CDG-CDG-CDG CGN-CDG

Solution

  • You can do this by using List.Accumulate and List.Last, paste this formula into a New Custom Column:

    Text.Combine(
        List.Transform(
            List.Accumulate(
                Text.SplitAny([Route], ",-") /* Will split given column on any of the listed delimiters in 2nd argument */
                , {}, (state,current) => 
                if List.Last(state) <> Text.Trim(current) 
                then List.Combine({state, {Text.Trim(current)}}) 
                else state 
             ), 
            Text.From
        ), "-" /* Delimiter for output column */
    )
    

    This function does the following:

    • Splits your text column into a list using the separators given in List.SplitAny
    • Using an empty list as a baseline in List.Accumulate, it loops over your split column list and checks how the last element of the list compares to the current (trimmed, to avoid leading/trailing spaces) loop value
    • If the current loop value is the same as the last element of the list, the element is skipped
    • If the current loop value is NOT the same as the last element, we add it to the state using List.Combine
    • Finally, after the loop has finished, we use Text.Combine with a set delimiter to combine the result from List.Accumulate into a text string

    While it is an advanced function, I am quite happy with it!