Search code examples
powerbidaxpowerquerypowerbi-desktopm

How to Summarize and Calculate Proportions in Power BI using DAX/M CODE for WIDE/Horizontal format data


I hope you're all doing well. I'm currently working on a Power BI project and I'm facing a challenge with transforming and summarizing the data in wide/horizontal format, I have a dataset structured as follows:

critical_level id_trans agent period type_a type_b type_c
1 a1 mark may 0 0 1
1 a2 mark may 1 1 1
0.5 a3 mark may 1 0 0
1 a4 sam may 1 0 1
0 a5 sam may 1 1 0
1 a6 sam may 1 0 0

please keep in mind that the cols type_a, type_b and type_c are calculated columns created from the raw data.

This is how they are calculated, I have the same three cols in my raw source data but instead of a number I have a categorical value with the mark "Present" or "Not Present" so I transform them into 0 and 1 (1 if present).

Here's a breakdown of the steps I'm trying to achieve:

1 Filter the dataset based on "critical_level" greater than zero.

2 Sum the "type_a," "type_b," and "type_c" columns for each group (agent and period).

3 Create a "subtotal" column as the count of each observation per group.

4 Calculate the proportions of "type_a," "type_b," and "type_c" over the "subtotal."

My goal is to transform this dataset into a summary table grouped by "agent" and "period." Here's the desired output:

agent period type_a type_b type_c subtotal
mark may 0.67 0.33 0.67 3
sam may 1.00 0.00 0.50 2

Solution

  • You're best doing this type of reshaping in Power Query.

    Start:

    enter image description here

    Filter out zero:

    enter image description here

    Create a group by as follows:

    enter image description here

    enter image description here

    In the formula bar for the last step, change it to the following:

    = Table.Group(#"Filtered Rows", {"agent ", "period "}, {{"type_a", each List.Sum([#"type_a "])/Table.RowCount(_), type nullable number}, {"type_b", each List.Sum([#"type_b "])/Table.RowCount(_), type nullable number}, {"type_c", each List.Sum([type_c])/Table.RowCount(_), type nullable number}, {"subtotal", each Table.RowCount(_), Int64.Type}})
    

    enter image description here

    Full code:

    let
        Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"critical_level ", type number}, {"id_trans ", type text}, {"agent ", type text}, {"period ", type text}, {"type_a ", Int64.Type}, {"type_b ", Int64.Type}, {"type_c", Int64.Type}}),
        #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([#"critical_level "] <> 0)),
        #"Grouped Rows" = Table.Group(#"Filtered Rows", {"agent ", "period "}, {{"type_a", each List.Sum([#"type_a "])/Table.RowCount(_), type nullable number}, {"type_b", each List.Sum([#"type_b "])/Table.RowCount(_), type nullable number}, {"type_c", each List.Sum([type_c])/Table.RowCount(_), type nullable number}, {"subtotal", each Table.RowCount(_), Int64.Type}})
    in
        #"Grouped Rows"