This question comes from an example that I'm trying to understand in The Definitive Guide to DAX, Second Edition chapter 4. If you want the sample Power BI file, you can download it from the website above; it's Figure 4-26 in chapter 4. Here is the DAX code:
Correct Average =
VAR CustomersAge =
SUMMARIZE ( -- Existing combinations
Sales, -- that exist in Sales
Sales[CustomerKey], -- of the customer key and
Sales[Customer Age] -- the customer age
)
RETURN
AVERAGEX ( -- Iterate on list of
CustomersAge, -- Customers/age in Sales
Sales[Customer Age] -- and average the customer’s age
)
I understand the logic behind how SUMMARIZE
and AVERAGEX
are used in this example, and the requirements are all clear. What's confusing to me is how AVERAGEX
references Sales[Customer Age]
. Since AVERAGEX
is operating on the summarized CustomersAge
table variable, I would have assumed that the syntax would have been something along the lines of:
AVERAGEX (
CustomersAge,
[Customer Age] -- This is the line that I assumed would be different
)
How is it that the code given in the book is correct? Does the table variable (and the summarized table it contains) somehow have pointers to the original underlying table and column names? And is that normal for writing DAX queries, to always reference the original underlying table and column names when using table variables for intermediate steps?
Yes, the columns have what's known as data lineage. Sometimes you even have to restore lineage if it gets lost. You can read more about it here: https://www.sqlbi.com/articles/understanding-data-lineage-in-dax/