I have a table (t
) which contains a column (c
) of JSONB objects, each containing an array (a
) with a varying length between 1-10 (inclusive). I need to build a query that selects 1000 rows from t
where c
includes 100 random rows of each possible length of a
. What would be the most concise way to write this query? My current query structure looks like this:
WITH length_1 AS (
SELECT *
FROM t
WHERE JSONB_ARRAY_LENGTH(c -> 'a') = 1
ORDER BY RANDOM()
LIMIT 100
),
length_2 AS (
SELECT *
FROM t
WHERE JSONB_ARRAY_LENGTH(c -> 'a') = 2
ORDER BY RANDOM()
LIMIT 100
)
...
SELECT *
FROM length_1
UNION
SELECT *
FROM length_2
...
You can use a window function to label each row with a row number within its partition, then use an outer select to limit to 100 for each partition.
select * from (
select t.*, row_number() over (partition by jsonb_array_length(c->'a') order by random()) as rn from t
) foo where rn<=100;
The two levels are needed because you can't use a window function in a WHERE or a HAVING.