Schema:
create table records(
id varchar,
updated_at bigint
);
create index index1 on records (updated_at, id);
The query below iterates over recently updated records: it fetches 10 records, remembers the last one, and then fetches the next 10 starting from it, and so on.
select * from records
where updated_at > '1' or (updated_at = '1' and id > 'some-id')
order by updated_at, id
limit 10;
It uses the index, but not wisely: the whole condition is applied as a filter rather than as an index condition, so the scan reads and discards tons of rows, see Rows Removed by Filter: 31575
in the query plan below.
The strange thing is that if you remove the or
and leave only the left or only the right condition, it works well in both cases. But it seems the planner can't figure out how to apply the index correctly when both conditions are combined with or
.
Limit (cost=0.42..19.03 rows=20 width=1336) (actual time=542.475..542.501 rows=20 loops=1)
-> Index Scan using index1 on records (cost=0.42..426791.29 rows=458760 width=1336) (actual time=542.473..542.494 rows=20 loops=1)
Filter: ((updated_at > '1'::bigint) OR ((updated_at = '1'::bigint) AND ((id)::text > 'some-id'::text)))
Rows Removed by Filter: 31575
Planning time: 0.180 ms
Execution time: 542.532 ms
(6 rows)
Postgres version is 9.6.
I would try this as two separate queries, combining their results like this:
select *
from
(
  (
    select *
    from records
    where updated_at > 1
    order by updated_at, id
    limit 10
  )
  union all
  (
    select *
    from records
    where updated_at = 1
    and id > 'some-id'
    order by updated_at, id
    limit 10
  )
) t
order by updated_at, id
limit 10;
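Note that in PostgreSQL the parentheses around each branch are required here: attaching order by / limit to a branch of a union all without parentheses is a syntax error.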
My guess is that each of the two inner queries would optimise well into a simple index range scan, and running both should be more efficient than the current plan.
I would also make those columns NOT NULL if possible.
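If you prefer a single query, another variant worth trying (not in the original answer, so treat it as a sketch): PostgreSQL supports SQL-standard row-value comparison, which expresses the same keyset condition as one predicate that a btree index on (updated_at, id) can typically use as a single range scan.
-- Row-wise comparison: (updated_at, id) > (1, 'some-id') is
-- equivalent to updated_at > 1 or (updated_at = 1 and id > 'some-id'),
-- but the planner can turn it into one index range condition.
select *
from records
where (updated_at, id) > (1, 'some-id')
order by updated_at, id
limit 10;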