Search code examples
joindaxrelationshippowerquerylookup

Table.SelectRows when one column concatenated and the other is split into rows


I want to create a custom column using Table.SelectRows (or any other expression) that does the following. In my Table1, I have a column called Cities that lists values separated by a comma:

Country Cities
France Paris, Lyon, Marseille
Germany Berlin, Munich, Dortmund

In my Table2, I have a Sister_Cities column like this:

City Sister_Cities
Paris Rome
Lyon Birmingham
Lyon Guangzhou
Lyon Montreal
Lyon Addis Ababa
Marseille Genoa
Marseille Glasgow

I know the logical thing to do would be to split the first table into rows and then do a join based on the Cities and City columns, but I want to keep them concatenated so that I have a new column in Table1 that looks like this:

Country Cities Custom.Sister_Cities
France Paris, Lyon, Marseille Rome, Birmingham, Guangzhou, Montreal, Addis Ababa, Genoa, Glasgow

What's the best way to achieve this? Appreciate any advice!


Solution

  • I would probably still do it that way

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Split Column by Delimiter" = Table.ExpandListColumn(Table.TransformColumns(Source, {{"Cities", Splitter.SplitTextByDelimiter(", ", QuoteStyle.Csv), let itemType = (type nullable text) meta [Serialized.Text = true] in type {itemType}}}), "Cities"),
    #"Merged Queries" = Table.NestedJoin(#"Split Column by Delimiter", {"Cities"}, Table2, {"City"}, "Table2", JoinKind.LeftOuter),
    #"Expanded Table2" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"Sister_Cities"}, {"Sister_Cities"}),
    #"Grouped Rows" = Table.Group(#"Expanded Table2", {"Country"}, {
        {"Cities", each Text.Combine(List.Transform(List.Distinct([Cities]), Text.From), ", "), type text},
        {"Sister_Cities", each Text.Combine(List.Transform([Sister_Cities], Text.From), ", "), type text}
        })      
    in  #"Grouped Rows"
    

    enter image description here

    alternate, probably slower

    let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Sisters= Table.Buffer(Table.Group(Table2, {"City"}, {{"Sister_Cities", each Text.Combine(List.Transform([Sister_Cities], Text.From), ", "), type text} })),
    #"Added Custom" = Table.AddColumn(Source, "Custom", each List.Transform(Text.Split([Cities],", "), (x)=> try Table.SelectRows(Sisters, each [City] = x)[Sister_Cities]{0} otherwise null)),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "Sister_Cities", each Text.Combine(List.Transform(List.RemoveNulls([Custom]), Text.From), ",")),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom1",{"Custom"})
    in  #"Removed Columns"