I need to scoop up a random sample of 1% of the records in a table (with the number of rows growing every second).
My idea is to
SELECT DISTINCT
random(),
name,
age,
registrationNumber
FROM everGrowingTable
ORDER BY random desc
LIMIT (
(select count(*) from everGrowingTable) * 0.01
) -- this is attempting to get 1%
The compiler complains about the *
operator. It is fine when I hard code the table size however.
I've tried IBM documentation, but this talks about calculations using known values, not values that grow (such is that case in my table)
There doesn't seem to be a Aginity SQL function that does this. I've notice the MINUS function in the Aginity Workbench Intellisense, but alas, no multiplication equivalent.
You could use window functions in a subquery to assign a random number to each record and compute the total record number, and then do the filtering in the outer query :
SELECT name, age, registrationNumber
FROM (
SELECT
name,
age,
registrationNumber,
ROW_NUMBER() OVER(ORDER BY random()) rn,
COUNT(*) OVER() cnt
FROM everGrowingTable
) x
WHERE rn <= cnt / 100
ORDER BY rn