Search code examples
stringloopspowerbiuniquedax

POWERBI, DAX : concatenate strings, split, and keep substrings only once


I try to do the following:

  • I have a column with strings, each can have several substrings separated by a delimiter ":"
  • I need to Concatenate the column strings (I do a Filter here to keep only interresting lines)
  • Then split according to the delimiter ":"
  • keep the substrings only once, if they are repeated.

Example:

ColumnHeader
AA:BB:CC
BB:DD
DD:AA:EE
EE:AA:DD:BB
BB:EE
...

Expected result would be a unique string:

"AA:BB:CC:DD:EE"

How would you do this in DAX to fill a new column ?

I expected to find for/while loops in DAX like in Python ... but failed.

I Tried this:

List =
VAR SIn = ""
VAR SOut = ""
VAR Cursor = 0
VAR SList =
        CONCATENATEX(
            FILTER(ATable, ATable[Name] = CTable[Name]),
            [ColumnHeader],
            ":")
VAR pos1 = FIND(":", SList, Cursor, len(SList))
VAR pos2 = FIND(":", SList, pos1, len(SList))
VAR elem = TRIM(MID(SList, pos1+1, pos2-pos1))

// following is not good but is what I would like to do:
VAR SOut = CONCATENATE(SOut, elem)
VAR SList = MID(SList, pos2, len(SList)-pos2)
VAR Cursor = pos2

// I need to loop ... but how ? ... as no for/while loops are possibles ?

Thanks for your help.

=====================================

I manage to tackle this thanks to the answers below.

I will still give a bigger data set for a better understanding of the global problem:

I have 2 tables:

TABLE_BY_ELEMENT            
KEY GROUP   LIST        KEY_DATA
1   G1      AA:BB:FF    11
2   G1      CC:AA       22
3   G1      FF:DD:AA    33
4   G1      CC:DD:AA    44
5   G2      CC:FF:GG    55
6   G2      BB:AA       66
            
            
TABLE_BY_GROUP          
GROUP   GROUP_DATA      
G1      1111        
G2      2222        

And I want to view the data like this:

RESULT_BY_GROUP     
GROUP   GROUP_DATA  NewList
G1      111         AA:BB:FF:CC:DD
G2      222         CC:FF:GG:BB:AA

and also:

RESULT_ELEMENT      
KEY LIST        KEY_DATA
1   AA:BB:FF    11
2   CC:AA       22
3   FF:DD:AA    33
4   CC:DD:AA    44
5   CC:FF:GG    55
6   BB:AA       66

I hope is is easier to understand with this.


Solution

  • This isn't something DAX is suited for well. If you need to use DAX to make it into a dynamic measure, then you'll probably need to reshape your data to be more usable. For example,

    ID   ColumnHeader
    1    AA
    1    BB
    1    CC
    2    BB
    2    DD
    3    DD
    3    AA
    3    EE
    ...
    

    You can do this split in the query editor using the Split Column > By Delimiter tool and choosing to split on the colon and expand into rows.

    Split Column

    Once it's in this more usable format, you can work with it in DAX like this:

    List = CONCATENATEX( VALUES('Table'[ColumnHeader]), 'Table'[ColumnHeader], ":" )
    

    Borrowing logic from here, it's possible to do this purely in DAX, but I don't recommend this route.

    List =
    VAR LongString =
        CONCATENATEX ( VALUES ( 'Table1'[ColumnHeader] ), Table1[ColumnHeader], ":" )
    VAR StringToPath =
        SUBSTITUTE ( LongString, ":", "|" )
    VAR PathToTable =
        ADDCOLUMNS (
            GENERATESERIES ( 1, LEN ( StringToPath ) ),
            "Item", PATHITEM ( StringToPath, [Value] )
        )
    VAR GroupItems =
        FILTER (
            SUMMARIZE ( PathToTable, [Item] ),
            NOT ISBLANK ( [Item] )
        )
    RETURN
        CONCATENATEX ( GroupItems, [Item], ":" )