I have table test
:
create table test
(
id serial
primary key,
status_id integer,
created_at timestamp with time zone default now() not null,
client_id integer
);
and I need to count all last rows with status_id
!= 3 sorted by created_at
desc.
Such table:
id, status_id, created_at, client_id
(1, 3, '2023-04-23 19:34:08.454000 +00:00', 2),
(2, 3, '2023-04-23 19:34:07.493000 +00:00', 2),
(3, 3, '2023-04-23 19:34:06.459000 +00:00', 2),
(4, 2, '2023-04-23 19:34:05.838000 +00:00', 2),
(5, 3, '2023-04-23 19:34:04.782000 +00:00', 2),
(6, 3, '2023-04-23 19:34:03.768000 +00:00', 2),
(7, 2, '2023-04-23 19:34:02.768000 +00:00', 2),
(8, 3, '2023-03-23 19:34:02.768000 +00:00', 1),
(9, 2, '2023-02-23 19:34:02.768000 +00:00', 1),
(10, 3, '2022-04-23 19:34:02.768000 +00:00', 1);
result for client with id 1 must be 3 and for client with id 2 will be 1:
client_id, count
2,3
1,1
I have this query:
select count(*), i.client_id
from public.test i
where i.created_at > (select x.created_at
from public.test x
where x.status_id != 3
and x.client_id = i.client_id
order by x.created_at desc
limit 1)
group by i.client_id;
its works but very slow. tried this index:
create index test_status_id_client_id_created_at_index
on public.test (status_id, client_id, created_at);
but index does not help.
Is there a way to optimize and speed up this query, or will it be necessary to make some kind of application that will export this data in a ready-made form in small time intervals?
explain:
GroupAggregate (cost=12.80..12.85 rows=3 width=12) (actual time=0.149..0.152 rows=2 loops=1)
Group Key: i.client_id
-> Sort (cost=12.80..12.81 rows=3 width=4) (actual time=0.144..0.145 rows=4 loops=1)
Sort Key: i.client_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on test i (cost=0.00..12.77 rows=3 width=4) (actual time=0.054..0.091 rows=4 loops=1)
Filter: (created_at > (SubPlan 1))
Rows Removed by Filter: 6
SubPlan 1
-> Limit (cost=1.16..1.16 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=10)
-> Sort (cost=1.16..1.16 rows=1 width=8) (actual time=0.007..0.007 rows=1 loops=10)
Sort Key: x.created_at DESC
Sort Method: quicksort Memory: 25kB
-> Seq Scan on test x (cost=0.00..1.15 rows=1 width=8) (actual time=0.001..0.002 rows=2 loops=10)
Filter: ((status_id <> 3) AND (client_id = i.client_id))
Rows Removed by Filter: 8
Planning Time: 0.826 ms
Execution Time: 0.314 ms
I use postgresql 14 version.
I'd replace the correlated sub-query with a regular sub-query, and join on it.
SELECT
last.client_id,
COUNT(*)
FROM
(
SELECT
client_id,
MAX(created_at) created_at
FROM
public.test
WHERE
status_id <> 3
GROUP BY
client_id
)
AS last
INNER JOIN
public.test
ON test.client_id = last.client_id
AND test.created_at > last.created_at
GROUP BY
last.client_id;
Then ensure an index on (client_id, created_at)
and a partial index on (client_id, created_at) WHERE status_id <> 3