I am attempting to load several Excel files into Power BI. These files are pretty small (<= ~1k rows). One of these sources must be cleaned up. In particular, one of its columns has some bad data. The correct data is stored in another Excel file. For example:
table bad:
ID col1
1 0
2 0.5
3 2
4 -3
table correct:
ID colx
2 1
4 5
desired output:
ID col1
1 0
2 1
3 2
4 5
In SQL or other data visualization tools, I would left join the bad table to the clean table and then coalesce the bad values and correct values. I know that I have some options on how to implement this in Power BI. I think one option is to implement it in the query editor (i.e., M). I think another option is to implement it in the data model (i.e., DAX). Which option is best? And, what would the implementation look like (e.g., if M, then what does the query look like)?
While you can do this in DAX, I'd suggest doing it in the query editor. The steps would look roughly like this:
Correct
table into the Bad
table using a left outer join on the ID columns
.Correct
table to just get the Colx
column.Create a custom column to pick the values you want. (Add Column > Custom Column)
if [Colx] = null then [Col1] else [Colx]
Col1
and Colx
column if you want or just keep them. If you delete Col1
, you can rename the Col2
column to be Col1
.If you don't want the source tables floating around, you can do all of the above in a single query similar to this:
let
BadSource = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTJQitWJVjICsfRMwWxjINsIzDIBsnSNlWJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Col1 = _t]),
CorrectSource = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlLSUTJUitWJVjIBskyVYmMB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [ID = _t, Colx = _t]),
Bad = Table.TransformColumnTypes(BadSource,{{"ID", Int64.Type}, {"Col1", type number}}),
Correct = Table.TransformColumnTypes(CorrectSource,{{"ID", Int64.Type}, {"Colx", type number}}),
#"Merged Queries" = Table.NestedJoin(Bad,{"ID"},Correct,{"ID"},"Correct",JoinKind.LeftOuter),
#"Expanded Correct" = Table.ExpandTableColumn(#"Merged Queries", "Correct", {"Colx"}, {"Colx"}),
#"Added Custom" = Table.AddColumn(#"Expanded Correct", "Col2", each if [Colx] = null then [Col1] else [Colx]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Col1", "Colx"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Columns",{{"Col2", "Col1"}})
in
#"Renamed Columns"