Have a base dataset which is a set of ID's, F.E.
000
111
222
333
444
555
666
777
888
999
and a keys dataset
1 - 20
2 - 40
3 - 40
The result should be a random pairing of the base dataset with the keys dataset according to the defined percentage, meaning 20% will recieve 1, 40% will recieve 2 and 40% will recieve 3
F.E. - first run
000 - 1
111 - 2
222 - 3
333 - 1
444 - 2
555 - 3
666 - 2
777 - 3
888 - 2
999 - 3
second run
000 - 2
111 - 3
222 - 2
333 - 3
444 - 2
555 - 3
666 - 1
777 - 3
888 - 2
999 - 1
etc.
Using SQL server 2014
Thought of solutions that involve loops and cusrors and temp tables but I was wondering if there is a "cleaner" and most probably more efficient solution...
Any ideas?
Here is one way using Numbers
table.
;WITH base_dataset
AS (SELECT *,
Row_number()OVER(ORDER BY id) AS rn
FROM (VALUES (000),
(111),
(222),
(333),
(444),
(555),
(666),
(777),
(888),
(999)) tc (ID)),
keys
AS (SELECT *
FROM (VALUES (1,20),
(2,50),
(3,30)) tc(val, per)),
num_gen
AS (SELECT 1 AS num,
Count(1) AS cnt
FROM base_dataset
UNION ALL
SELECT num + 1,
cnt
FROM num_gen
WHERE num < cnt)
SELECT Id,val
FROM (SELECT Row_number()OVER(ORDER BY newid()) rn,
val
FROM num_gen n
JOIN keys k
ON n.num <= (k.per/100.0) * cnt) a
JOIN base_dataset d
ON d.rn = a.rn
I have used Recursive CTE
to generate numbers you can create a numbers table in database and use it