I have item sales data from a number of regions. For each region (North, West etc), there can be multiple sales items for each day. I would like to remove rows for the first (MIN) and last (MAX) of the DATE column for each REGION. In the dataset below, the 2 rows for the 'Fri 09 Dec 22 and the 2 rows for 'Mon 12 Dec 22' would be removed for the 'North' region. All rows between the MIN and MAX of the date column for that region should be kept. Again for the next region 'South', the rows for 'Tue 06 Dec 22' and 'Mon 12 Dec 22' should be removed and all rows between those 2 dates kept etc....
Region | Date | Product | Amount |
---|---|---|---|
North | Fri 09 Dec 22 | Potato | $75 |
North | Fri 09 Dec 22 | Potato | $59 |
North | Sat 10 Dec 22 | Potato | $131 |
North | Sat 10 Dec 22 | Apple | $14 |
North | Sun 11 Dec 22 | Tomato | $88 |
North | Sun 11 Dec 22 | Tomato | $48 |
North | Mon 12 Dec 22 | Carrot | $26 |
North | Mon 12 Dec 22 | Carrot | $13 |
South | Tue 06 Dec 22 | Banana | $69 |
South | Tue 06 Dec 22 | Banana | $18 |
South | Tue 06 Dec 22 | Tomato | $11 |
South | Thu 08 Dec 22 | Pineapple | $8 |
South | Thu 08 Dec 22 | Pineapple | $18 |
South | Thu 08 Dec 22 | Apple | $115 |
South | Sun 11 Dec 22 | Pineapple | $32 |
South | Sun 11 Dec 22 | Pineapple | $29 |
South | Mon 12 Dec 22 | Carrot | $40 |
West | Wed 07 Dec 22 | Plum | $89 |
West | Wed 07 Dec 22 | Plum | $144 |
West | Wed 07 Dec 22 | Apple | $19 |
West | Thu 08 Dec 22 | Orange | $33 |
West | Thu 08 Dec 22 | Orange | $109 |
West | Fri 09 Dec 22 | Orange | $61 |
West | Fri 09 Dec 22 | Orange | $14 |
West | Fri 09 Dec 22 | Orange | $89 |
West | Fri 09 Dec 22 | Apple | $103 |
West | Fri 09 Dec 22 | Orange | $2 |
West | Fri 09 Dec 22 | Orange | $87 |
West | Sat 10 Dec 22 | Potato | $106 |
West | Sat 10 Dec 22 | Apple | $1 |
West | Sun 11 Dec 22 | Tomato | $75 |
West | Sun 11 Dec 22 | Tomato | $19 |
West | Sun 11 Dec 22 | Tomato | $114 |
West | Sun 11 Dec 22 | Apple | $61 |
West | Mon 12 Dec 22 | Apple | $37 |
West | Mon 12 Dec 22 | Carrot | $15 |
West | Mon 12 Dec 22 | Carrot | $16 |
I'm able to remove the MIN and MAX values for the whole date range, but not for each REGION. If I create a grouping and flag the MIN and MAX for each region, I can get the right rows but the Product and Amount columns become lists - which if expanded to values create many duplicate rows:
let
Source = Excel.Workbook(File.Contents("C:\Temp\Test Data\2022 Sales.xlsx"), null, true),
Test_Sheet = Source{[Item="Test",Kind="Sheet"]}[Data],
#"Promoted Headers" = Table.PromoteHeaders(Test_Sheet, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Region", type text}, {"Date", type date}, {"Time", type datetime}, {"Product", type text}, {"Amount", Int64.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Region"}, {
{"Date", each [Date]},
{"Product", each [Product]},
{"Amount", each [Amount]},
{"DontKeepMinFlag", each Table.First(_)[Date]},
{"DontKeepMaxFlag", each Table.Last(_)[Date]}}),
#"Expanded Date" = Table.ExpandListColumn(#"Grouped Rows", "Date"),
#"Added Conditional Column" = Table.AddColumn(#"Expanded Date", "Custom", each if [DontKeepMinFlag] = [Date] then true else if [DontKeepMaxFlag] = [Date] then true else false),
#"Filtered Rows" = Table.SelectRows(#"Added Conditional Column", each ([Custom] = false)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"DontKeepMinFlag", "DontKeepMaxFlag", "Custom"})
in
#"Removed Columns"
Thank you for providing a usable text example.
You can do all the selecting in your Table.Group Aggregations, and then just expand the column of tables (except for the Region Column).
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("jdTJbsMgEAbgV0FWjzkwgG187KLeukiJlEOUA2pREikxEbXfv5imwYMaOvLFSJ8G5mfZbKpX54d9taie/YHxjj3ZDyZEGL+7wQwu/Ny1dbVdUGDdIbg0AwP+FwQJRXl/Ph9thAq7sWcAya3c6VJRayJUGL64AEWCj8Z7N0xQNEQIMsKlGyNcjZbxJsEH04dvgk1HhKCLMDUDgOF+ZFzPAj/01vxGqek0nz+zaXegRjBPHRWVgm4FTup29opHuLZf03BtPxlvZzWP4ym23lEUKFViqWtULQ/nzZt+99OwpDngqGB+vRJsgOZA0Zwuzpv65ZJWTxCnbeeu8EjwpgTT+pC6efEvr9i/Dm9vweGYc3hdHt61/ChfmWxLbPba1EQXwtt+Aw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Region = _t, Date = _t, Product = _t, Amount = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Region", type text}, {"Date", type date}, {"Product", type text}, {"Amount", Currency.Type}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Region"}, {
{"Trimmed", (t)=>Table.SelectRows(t,
each [Date] <> List.Min(t[Date]) and [Date] <> List.Max(t[Date])),
type table[Region=nullable text, Date=nullable date, Product=nullable text, Amount=nullable number]}}),
#"Expanded Trimmed" = Table.ExpandTableColumn(#"Grouped Rows", "Trimmed", {"Date", "Product", "Amount"})
in
#"Expanded Trimmed"