Search code examples
excelexcel-formulagroupingpowerquery

MS Power Query - eliminating duplicates in a set of delimited values


I have a excel file with data something like below:

enter image description here

A1 | 13,15,14,16,17

A2 | 13,16

I want the data to look like

A3 | 13, 15, 16, 17

Merge the selected cells data into single cell, avoid duplicate values.

Also, while doing this compare the values to other cells and then exclude those cells.

Say, B1 | 13,14

Then Final Output should not include values from B1

A3 | 15, 16, 17

I tried using Text Join, Unique functions but none of them are working. I found out about XMLconvert function but that is also not supporting excel.


Solution

  • Since OP has tagged Excel Formula then one could try the following :

    enter image description here


    • Formula used in cell A3

    =TEXTJOIN(",",,UNIQUE(TOCOL(TEXTSPLIT(TEXTAFTER(","&A1:A2,",",SEQUENCE(,MAX(LEN(A1:A2)-LEN(SUBSTITUTE(A1:A2,",",))+1))),","),3)))
    

    Or,

    =ARRAYTOTEXT(DROP(UNIQUE(REDUCE("",A1:A2,LAMBDA(x,y,VSTACK(x,TEXTSPLIT(y,,","))))),1))
    

    With Power Query, you can try using the following M Code:

    enter image description here


    let
        Source = Excel.CurrentWorkbook(){[Name="DelimitedTab"]}[Content],
        #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Column1", Splitter.SplitTextByDelimiter(",", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Column1"),
        #"Removed Duplicates" = Table.Distinct(#"Split Column by Delimiter"),
        Column1 = Text.Combine(#"Removed Duplicates"[Column1],",")
    in
        Column1
    

    Or, if the data is limited then one could use the following as well:

    enter image description here


    =ARRAYTOTEXT(UNIQUE(TEXTSPLIT(TEXTJOIN(",",,A1:A2),,",")))
    

    Edit:

    As per OP's comment:

    Also, while doing this compare the values to other cells and then exclude those cells. Say, B1 | 13,14 Then Final Output should not include values from B1 A3 | 15, 16, 17


    enter image description here


    • Formula used in cell A4

    =LET(
         α, SORT(UNIQUE(TEXTSPLIT(TEXTJOIN(",",,A1:A2),,","))), 
         TEXTJOIN(",",1,IF(ISERROR(XMATCH(α, TEXTSPLIT(B1,","))),α,"")))
    

    Or,

    • Formula used in cell A5

    =LET(
         α, SORT(UNIQUE(TEXTSPLIT(TEXTJOIN(",",,A1:A2),,","))), 
         ARRAYTOTEXT(FILTER(α, ISNA(XMATCH(α,TEXTSPLIT(B1,","))))))
    

    Or,

    • Formula used in cell A6

    =LET(
         α,SORT(UNIQUE(--TEXTSPLIT(TEXTJOIN(",",,A1:A2),,","))),
         TEXTJOIN(",",,REPT(α,ISERR(SEARCH(α,B1)))))
    

    • Formula used in cell A7

    =LET(α,UNIQUE(SORT(--TEXTSPLIT(A1&","&A2,,","))),ARRAYTOTEXT(TOCOL(α/ISERR(SEARCH(α,B1)),3)))
    

    Test cases:

    enter image description here


    enter image description here


    enter image description here


    enter image description here


    enter image description here