I have SQL results that I need to break into item ranges and the count distributed evenly across a number of tasks. What is a good way to do this?
My data looks like this.
+------+-------+----------+
| Item | Count | ItmGroup |
+------+-------+----------+
| 1A | 100 | 1 |
| 1B | 25 | 1 |
| 1C | 2 | 1 |
| 1D | 6 | 1 |
| 2A | 88 | 2 |
| 2B | 10 | 2 |
| 2C | 122 | 2 |
| 2D | 12 | 2 |
| 3A | 4 | 3 |
| 3B | 103 | 3 |
| 3C | 1 | 3 |
| 3D | 22 | 3 |
| 4A | 55 | 4 |
| 4B | 42 | 4 |
| 4C | 100 | 4 |
| 4D | 1 | 4 |
+------+-------+----------+
Item = the item code. Count = this context it is determining the popularity of the item. This can be used to RANK items if need be. ItmGroup - this is a parent value for the Itm column. Item is contained in a Group.
What differentiates this from other similar questions I'veviewed is that the ranges I need to determine cannot be taken out of the order they show in this table. We can do Item Range from A1 to B3, in other words, they can cross over ItmGroups, but they must remain in alphanumeric order by Item.
The expected result would be item ranges that evenly distribute the total count.
+------+-------+----------+
| FrItem | ToItem | TotCount|
+------+-------+----------+
| 1A | 2D | 134 |
| 3A | 3D | 130 |
(etc)
Provided you've happy with a rough estimate, this will split the data in to two groups.
The first group will always have as many records as possible, but no more than half of the total count (and group 2 will have the rest).
WITH
cumulative AS
(
SELECT
*,
SUM([Count]) OVER (ORDER BY Item) AS cumulativeCount,
SUM([Count]) OVER () AS totalCount
FROM
yourData
)
SELECT
MIN(item) AS frItem,
MAX(item) AS toItem,
SUM([Count]) AS TotCount
FROM
cumulative
GROUP BY
CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
ORDER BY
CASE WHEN cumulativeCount <= totalCount / 2 THEN 0 ELSE 1 END
To split the data in to 5 portions, it's similar...
GROUP BY
CASE WHEN cumulativeCount <= totalCount * 1/5 THEN 0
WHEN cumulativeCount <= totalCount * 2/5 THEN 1
WHEN cumulativeCount <= totalCount * 3/5 THEN 2
WHEN cumulativeCount <= totalCount * 4/5 THEN 3
ELSE 4 END
Depending on your data this isn't necessarily ideal
Item | Count GroupAsDefinedAbove IdealGroup
------+-------
1A | 4 1 1
2A | 5 2 1
3A | 8 2 2
If you want something that can get the two groups as close in size as possible, that's a lot more complex.