I have table, where I put lets call it manual values that are used later in my code. This table looks like that:
subId | MonthNo | PackagesNumber | Country | EntryMethod | PaidAmount | Version
1 | 201701 | 223 | NO | BCD | 44803 | 2
2 | 201701 | 61 | NO | GHI | 11934 | 2
3 | 201701 | 929 | NO | ABC | 88714 | 2
4 | 201701 | 470 | NO | DEF | 98404 | 2
5 | 201702 | 223 | NO | BCD | 28225 | 2
All I have to do is, to divide those values into single rows, at the level of single package. In example, there are 223 packages in January 2017 in Country NO with EntryMethod BCD, so I want 223 separate rows. PaidAmount should be also divided by number of PackagesNumber.
The problem is I have to associate date to every record. Records should be distributed evenly through whole month. I have Date dimension, that I can intersect with my table by pulling month and year separately from MontNo. For example, January 2017, EntryMethod BCD I have packages, so it's ~7 packages per day.
That's what I want:
subId | Date | Country | Packages | EntryMethod | PaidAmount | Version
1 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
2 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
3 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
4 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
5 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
6 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
7 | 01.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
8 | 02.01.2017 | NO | 1 | BCD | 200.910313901345 | 2
Bonus: I wrote code, that's dividing Packages into single records, and it's putting first day of each month as date.
SELECT
Date =
(
SELECT TOP 1
date
FROM dim_Date dim
WHERE dim.Month = a.Month
AND dim.Year = a.Year
)
, Country
, EntryMethod
, Deliveries = 1
, PaidAmount = NULLIF(PaidAmount, 0) / PackagesNumber
, SubscriptionId = 90000000 + ROW_NUMBER() OVER(ORDER BY n.number)
, Version
FROM
(
SELECT
[Year] = LEFT(MonthNo, 4)
, [Month] = RIGHT(MonthNo, 2)
, Country
, EntryMethod
, PackagesNumber
, PaidAmount
, Version
FROM tgm.rep_PredictionsReport_ManualValues tgm
/*WHERE MonthNo = 201701*/
) a
JOIN master..spt_values n
ON n.type = 'P'
AND n.number < CAST(PackagesNumber AS INT);
EDIT: I made some progress. I used NTILE function, to divide rows into groups. The only thing that changed is Date from top level select. It looks like that now:
Date = concat([Year], '-', [Month], '-', case when ntile(31) over(order by n.number) < 10 then '0' + cast(ntile(31) over(order by n.number) as varchar(2)) else cast(ntile(31) over(order by n.number) as varchar(2)) end)
Explanation: I am creating Date filed using Year and Month fields, and NTILE over number of days in month(now it's static number, but later to be changed). Results aren't good as I'd expect, it's creating groups twice as big as they should be(14 instead of 7 rows in each date).
You can accomplish this using the modulo operator, which allows you to divide items into a set number of categories.
Here is a full test: http://rextester.com/TOROA96856
Here is the relevant query:
--recursive query to expand each row.
with expand_rows (subid,monthno,month,packagesnumber,paidamount) as (
select subid,monthno,month,packagesnumber,(paidamount+0.0000)/packagesnumber
from initial_table
union all
select subid,monthno,month,packagesnumber-1,paidamount
from expand_rows where packagesnumber >1
)
select expand_rows.*,(packagesnumber % numdays)+1 day, paidamount from expand_rows
join dayspermonth d on
d.month = expand_rows.month
order by subid, day
option (maxrecursion 0)
(packagesnumber % numdays)+1
is the modulo operation that assigns items to a day.
Note that I precomputed a table of the number of days in each month in order to use in the query. I also simplified the problem slightly for purposes of the answer (added a pure month column because I didn't want to mess around with replicating your date dimension).
You may need to tweak the modulo query if you care where the extra items end up when things don't divide evenly (e.g. if you have 32 items in January, which day has an extra item?). In this example the second day of the month tends to get the most (because of adding 1 to account for the fact that the last day of the month ends up 0). If you want the extra days to fall at the beginning of the month you could use a case statement that converts 0 to the number of days in the month, instead.