I can do two select queries:
But instead, I want to know if I can do the above in a single query?
I know it's possible using a subquery as in
SELECT * FROM customers
WHERE cust_id=(SELECT cust_id FROM customers LIMIT 1)
Is this an efficient way, or there is something better?
The requirement is to archive records that belong to a random cust_id or it might be correct for it to be the oldest cust_id (using aws lambda function in Python which runs periodically). Hence, I would like to fetch a cust_id and every other row with the same cust_id in a single transaction.
Using Postgres 10.5, with the DB table definition as below.
id BIGINT PRIMARY KEY,
cust_id VARCHAR(100) NOT NULL,
created TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
cust_data JSONB
Sample input
|id|cust_id| cust_data | created |
|--|-------|------------------------|--------------------|
|8 |cust9 |{"a": {"b": "c"}} |25/11/2022 01:05:39|
|25|cust1 |{"x": "y"} |05/12/2022 14:59:21|
|40|cust9 |{"d": {"b": {"c": "z"}}}|07/12/2022 11:29:14|
|87|cust2 |{"r": {"s": "t"}} |13/12/2022 21:10:18|
|99|cust1 |{"p": "q"} |20/12/2022 14:59:21|
Expected output
|id|cust_id| cust_data | created |
|--|-------|------------------------|--------------------|
|8 |cust9 |{"a": {"b": "c"}} |25/11/2022 01:05:39|
|40|cust9 |{"d": {"b": {"c": "z"}}}|07/12/2022 11:29:14|
OR
Expected output
|id|cust_id| cust_data | created |
|--|-------|------------------------|--------------------|
|25|cust1 |{"x": "y"} |05/12/2022 14:59:21|
|99|cust1 |{"p": "q"} |20/12/2022 14:59:21|
OR
Expected output
|id|cust_id| cust_data | created |
|--|-------|------------------------|--------------------|
|87|cust2 |{"r": {"s": "t"}} |13/12/2022 21:10:18|
During one query, we only want one of these expected outputs. Also need to mention that the created here is not the actual timestamp when this entry was created, as the entries made into this table are copied from another. So I think we can't decide which is the cust_id that has got the oldest entry.
To get all rows for a single, arbitrary customer in a single query, WITH TIES
in Postgres 13 or later should be most efficient:
SELECT *
FROM customers
ORDER BY cust_id
FETCH FIRST 1 ROWS WITH TIES;
If the table is big, support this query with an index on (cust_id)
.
See:
Postgres 10 has reached EOL in 2022, so you need to upgrade ASAP anyway.