Search code examples
excelpowerquerydata-analysisdata-cleaningm

Excel, Power Query: combine tables, pick only one item


In a continuation of this question: Excel, Power Query: Add multiple columns from query2 into query1

Thanks to Ron I have a code to combine table1:

Table1

And table2:

enter image description here

into this result table:

enter image description here

However, under item in the result table. I don't want to have double data anymore. I want to have only one item of data. In this example were it says 345345, 111111. I only want to have 345345.

This is the code Ron initially made for me to get the table above:

let
    Source = Excel.CurrentWorkbook(){[Name="Overview"]}[Content],
    #"Overview" = Table.TransformColumnTypes(Source,{
        {"NAME.1", type text}, {"NAME.2", type text}, {"NUMBER", Int64.Type}, {"RANDOM", type text}}),
  
//Add index column to be able to go back to original table order
    #"Added Index" = Table.AddIndexColumn(Overview, "Index", 0, 1, Int64.Type),

//Create a Name column that we will expand to all names in a single column
    #"Added Custom" = Table.AddColumn(#"Added Index", "Name", each List.RemoveNulls({[NAME.1]} & {[NAME.2]}), type {text}),
    #"Expanded Name" = Table.ExpandListColumn(#"Added Custom", "Name"),

//Use same technique on Data table
//Be certain the ITEM column is of "type text" if not set that way initially
    #"Added Custom2" =Table.AddColumn(Data,"dataName",each {[NAME.1],[NAME.2]}, type {text}),
    #"Expanded dataName" = Table.ExpandListColumn(#"Added Custom2", "dataName"),

//Extract these fields from the Data table
    #"Data Fields" = List.FirstN(Table.ColumnNames(#"Expanded dataName"),3),

//Merge the tables and return all the columns from the DATA table
    #"Merge Overview/Data" = Table.NestedJoin(#"Expanded Name","Name",#"Expanded dataName","dataName","Join",JoinKind.LeftOuter),

//Extract the desired fields
    #"Add Custom3" = Table.AddColumn(#"Merge Overview/Data", "Fields", each if [Name]=null then null
            else if Table.RowCount([Join]) = 0 then null
            else Record.SelectFields([Join]{0}, #"Data Fields"),type record),
    #"Removed Columns" = Table.RemoveColumns(#"Add Custom3",{"Name", "Join"}),
    #"Expanded Fields" = Table.ExpandRecordColumn(#"Removed Columns", "Fields", {"ITEM", "ITEM 2", "Y/N"}),
    #"Grouped Rows" = Table.Group(#"Expanded Fields", {"Index"}, {{"all", 
        each Table.TransformColumns(_, {
            {"ITEM", (c)=>Text.Combine(_[ITEM], ", ")},
            {"ITEM 2",(c)=>Text.Combine(_[ITEM 2], ", ")}, 
            {"Y/N", (c)=>Text.Combine(_[#"Y/N"], ", ")}}
        ){0}}}),
       
    #"Removed Columns1" = Table.RemoveColumns(#"Grouped Rows",{"Index"}),
    #"Expanded all" = Table.ExpandRecordColumn(#"Removed Columns1", "all", {"NAME.1", "NAME.2", "NUMBER", "RANDOM", "Index", "ITEM", "ITEM 2", "Y/N"}),
    #"Removed Columns2" = Table.RemoveColumns(#"Expanded all",{"Index"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Removed Columns2",{
        {"NAME.1", type text}, {"NAME.2", type text}, {"NUMBER", Int64.Type}, 
        {"RANDOM", type text}, {"ITEM", type text}, {"ITEM 2", type text}, 
        {"Y/N", type text}})
in
    #"Changed Type"

I tried changing this part:

#"Grouped Rows" = Table.Group(#"Expanded Fields", {"Index"}, {{"all", 
    each Table.TransformColumns(_, {
        {"ITEM", (c)=>Text.Combine(_[ITEM], ", ")},
        {"ITEM 2",(c)=>Text.Combine(_[ITEM 2], ", ")}, 
        {"Y/N", (c)=>Text.Combine(_[#"Y/N"], ", ")}}
    ){0}}}),

Into this:

#"Grouped Rows" = Table.Group(#"Expanded Fields", {"Index"}, {{"all", 
    each Table.TransformColumns(_, {
        {"ITEM", (c)=>_[ITEM]{0}},
        {"ITEM 2",(c)=>_[ITEM 2]{0}}, 
        {"Y/N", (c)=>_[#"Y/N"]{0}}}
    ){0}}}),

This partially works. It works when the item value is in the first index. However, when the first index is empty and the second has a value. Then it comes out empty.

I'm trying to make it more clear. For example I get the following combinations:

aaaa, 0 (shows as: aaaa) (0 = no value in this example)
0, bbbb (shows as: bbbb) (0 = no value in this example)
cccc, dddd (shows as: cccc, dddd)

When I use the modification above I will get the following:

aaaa
0 (0 = no value in this example)
cccc

What I need is:

aaaa
bbbb
cccc

What can I do to modify my current code to get what I need?


Solution

  • Typing blind but try this:

    #"Grouped Rows" = Table.Group(#"Expanded Fields", {"Index"}, {{"all", 
        each Table.TransformColumns(_, {
            {"ITEM", (c)=>List.RemoveNulls(_[ITEM]){0}?},
            {"ITEM 2",(c)=>List.RemoveNulls(_[ITEM 2]){0}?}, 
            {"Y/N", (c)=>List.RemoveNulls(_[#"Y/N"]){0}?}}
        ){0}}}),