I have a bunch of data which has the weight of different products with and without lid.
Location | Product | Date of manufacturing | Weight without lid | Weight with lid | Qty |
---|---|---|---|---|---|
US | Food | 04/20/2024 | 3 | 5 | 2 |
US | Food | 04/14/2024 | 72 | 79 | 1 |
US | Food | 04/20/2024 | 79 | 84 | 2 |
US | Food | 04/19/2024 | 82 | 86 | 3 |
US | Drinks | 04/20/2024 | 69 | 76 | 1 |
US | Drinks | 04/14/2024 | 8 | 13 | 5 |
US | Drinks | 04/17/2024 | 51 | 56 | 3 |
US | Drinks | 04/15/2024 | 4 | 6 | 4 |
Australia | Food | 04/19/2024 | 13 | 14 | 2 |
Australia | Food | 04/18/2024 | 8 | 17 | 5 |
Australia | Food | 04/21/2024 | 89 | 97 | 2 |
Australia | Food | 04/19/2024 | 5 | 13 | 4 |
Australia | Stationery | 04/20/2024 | 63 | 67 | 4 |
Australia | Stationery | 04/19/2024 | 74 | 76 | 6 |
Australia | Stationery | 04/20/2024 | 72 | 80 | 5 |
Australia | Stationery | 04/12/2024 | 32 | 34 | 3 |
What I'm trying to plot is the frequency distribution of the weights. On x-axis, I have weight ranging from 0 to 100 and y-axis is my frequency distribution. If, for example, I have a product whose weight with and without product is 72 and 79 respectively, the y-axis should be 1 for all weight on the x-axis between 72 and 79.
I'm expecting that I can plot a graph with data similar to below.
Weight | Distribution |
---|---|
80 | 7 |
81 | 2 |
82 | 5 |
83 | 5 |
84 | 5 |
85 | 3 |
86 | 3 |
87 | |
88 | |
89 | 2 |
90 | 2 |
Any help would be greatly appreciated. Thanks.
Update #03: 10-05-2024
Given the new requirement to add the total quantity instead of counting the products, the new measure should be:
FrequencyDist_02 =
VAR CurrentWeight = SELECTEDVALUE(FP_RANGE[Parameter])
VAR _filteredTable = FILTER(
'Table',
[Weight without lid] <= CurrentWeight &&
[Weight with lid] >= CurrentWeight
)
VAR totalQuantity = SUMX(
_filteredTable,
[Quantity]
)
VAR _result = IF(ISEMPTY(_filteredTable), 0, totalQuantity)
RETURN _result
Update #02: 08-05-2024 After hearing the new requirements, you could work with a disconnected table that generates the values between 0 and 100, and an additional measure that iterates over each value between 0 and 100, filters the table to get all rows that have that current value between 'Weight without lid' and 'Weight with lid', and return the count of that.
Disconnected table:
FP_RANGE = GENERATESERIES(0, 100, 1)
Measure
FrequencyDist =
VAR CurrentWeight = SELECTEDVALUE(FP_RANGE[Parameter])
VAR _filteredTable = FILTER(
SampleTable,
[Weight without lid] <= CurrentWeight &&
[Weight with lid] >= CurrentWeight
)
VAR counter = IF(ISEMPTY(_filteredTable), 0, COUNTROWS(_filteredTable))
RETURN counter
CurrentWeight
will select the current selected value of the 0-100 series. In a table visual, it will just go row-by-row, so 0, 1, etc.Illustration of filtered table with value "13"
Frequency distributions are not that easy in Power BI when you have gaps and the amount of sample data is small. Perhaps you could go for a third-party way using R or Python scripting that plots a density distribution (graph on the left) or histogram (graph on the right) for any specific product filter? When a particular product filter is selected, the graph gets recalculated.
Using the R visual, you need to include the values you would like to do calculations with. In this case, I have added the 'Weight with lid' and 'Weight without lid' columns. Afterwards, you need to create the R code that would output a plot in R (which is then shown in Power BI). I have used the following code:
# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset <- data.frame(Weight with lid, Weight without lid)
# dataset <- unique(dataset)
# Paste or type your script code here:
dataset <- unlist(dataset, use.names=FALSE)
plot(density(dataset)) # OR use histogram(dataset)
At the same time, it also fits your requirement to have a dynamic X-axis, but this is also customisable inside R.