Pick random values from different table-columns (postgresql)


Here is the outline of the problem: there are two tables, Calendar and Transactions, and both have an id column.

I want to create a query such that for each Calendar.id I get a random Transactions.id.

All my attempts to produce a random value for each Calendar.id fail, since they keep returning the same value for every row.

For example

SELECT "Calendar".id,
      (SELECT id FROM "Transactions" ORDER BY Random() LIMIT 1) as tr_id
FROM "Calendar"

I also tried

WITH cte_tr_array AS (SELECT ARRAY_AGG("Transactions".id) AS id_array
                      FROM "Transactions"
                      LIMIT 1
                     )
SELECT * ,
  (SELECT * FROM (SELECT cte_tr_array.id_array[FLOOR(RANDOM() * array_length(cte_tr_array.id_array,1))] FROM cte_tr_array) as tr_array) as transaction_id
FROM "Calendar";

But again, I get the same value for transaction_id in every row. It does not select a new value from the array each time.

What did work was the following

WITH cte_tr_array           AS (SELECT ARRAY_AGG("Transactions".id) AS id_array
                                FROM "Transactions"
                                LIMIT 1
                               ),
     cte_count_transactions AS (SELECT COUNT(*) AS count
                                FROM "Transactions"
                               )
SELECT *, (SELECT id_array[aug_c.random_id] FROM cte_tr_array) AS transaction_id
FROM (SELECT *,
             -- +1 because PostgreSQL array subscripts start at 1, not 0
             FLOOR(RANDOM() * (SELECT * FROM cte_count_transactions LIMIT 1))::int + 1 AS random_id
      FROM "Calendar"
     ) AS aug_c;

Is there a simpler way to do what I want? What if I want to pick values from columns of 10 different tables?

Thank you in advance for your time.


Solution

  • Since your subquery is not correlated with the outer query, PostgreSQL runs it only once and reuses the result for every row. You can add a dummy reference to the outer query inside the subquery to make it correlated, so that it is re-evaluated for each row.

    Assuming the Calendar id column is numeric, you can just add it to the random value. Adding the same constant to every sort key leaves the relative order unchanged, so the pick is still random.

    SELECT "Calendar".id,
          (SELECT id FROM "Transactions" ORDER BY Random()+"Calendar".id LIMIT 1) as tr_id
    FROM "Calendar"
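
    To pick values from several tables, the same trick can be repeated with one correlated subquery per column. This is only a minimal sketch: "Accounts" is a hypothetical second table with a numeric id column, not something from the question, and the aliases c, t and a are introduced for readability.

    SELECT c.id,
           -- each subquery references c.id, so it is re-evaluated for every Calendar row
           (SELECT t.id FROM "Transactions" AS t ORDER BY Random() + c.id LIMIT 1) AS tr_id,
           -- "Accounts" is a hypothetical table used for illustration only
           (SELECT a.id FROM "Accounts" AS a ORDER BY Random() + c.id LIMIT 1) AS acc_id
    FROM "Calendar" AS c;

    Each subquery calls Random() on its own, so the picks are independent of one another. Note that every subquery has to read and order its whole table once per Calendar row, which may get slow when the tables are large.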