How does PostgreSQL estimate the number of rows in a JOIN query like:
EXPLAIN
SELECT *
FROM R, S
WHERE (R.StartTime < S.EndTime) AND (S.StartTime < R.EndTime);
There is a chapter in the manual addressing your question exactly: "Row Estimation Examples". It explains what Laurenz provided, among other things.
But that's not the full story yet. We also need the number of rows (cardinalities) of the underlying tables. Postgres uses estimate_rel_size(), defined in src/backend/optimizer/util/plancat.c:
/*
* estimate_rel_size - estimate # pages and # tuples in a table or index
*
* We also estimate the fraction of the pages that are marked all-visible in
* the visibility map, for use in estimation of index-only scans.
*
* If attr_widths isn't NULL, it points to the zero-index entry of the
* relation's attr_widths[] cache; we fill this in if we have need to compute
* the attribute widths for estimation purposes.
*/
void
estimate_rel_size(Relation rel, int32 *attr_widths,
                  BlockNumber *pages, double *tuples, double *allvisfrac)
...
Here is a minimal SQL query to reproduce the calculation (ignoring some corner cases):
SELECT (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint
FROM pg_class
WHERE oid = 'mytable'::regclass; -- your table here
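One of the ignored corner cases: a table that has never been vacuumed or analyzed has relpages = 0, which makes the expression above divide by zero. A defensive variant (just a sketch; it does not replicate the actual fallback in estimate_rel_size(), which estimates tuple density from column widths):
SELECT CASE WHEN relpages > 0
            THEN (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint
       END AS estimated_rows  -- NULL for a never-vacuumed table
FROM pg_class
WHERE oid = 'mytable'::regclass; -- your table here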
More details:
CREATE TEMP TABLE r(id serial, start_time timestamptz, end_time timestamptz);
CREATE TEMP TABLE s(id serial, start_time timestamptz, end_time timestamptz);
INSERT INTO r(start_time, end_time)
SELECT now(), now() -- actual values don't matter for this particular case
FROM generate_series (1, 5000);
INSERT INTO s(start_time, end_time)
SELECT now(), now()
FROM generate_series (1, 10000);
VACUUM r, s; -- set reltuples & relpages in pg_class
-- add 2000 rows to S
INSERT INTO s(start_time, end_time)
SELECT now(), now()
FROM generate_series (1, 2000);
pg_class still has 5000 and 10000 reltuples, but we know there are 5000 & 12000 rows in R and S. (Since these are temporary tables, they are not covered by autovacuum, so numbers are never updated automatically.) Check:
SELECT relname, reltuples, relpages -- 5000 | 10000
FROM pg_class c
WHERE c.oid IN ('pg_temp.r'::regclass, 'pg_temp.s'::regclass);
SELECT count(*) FROM r; -- 5000
SELECT count(*) FROM s; -- 12000
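Still, the planner's estimate for s will not be 10000. Applying the formula from estimate_rel_size() shown above: the stale reltuples / relpages provide the rows-per-page density, and the current physical size supplies an up-to-date page count. This should come out close to the rows=12031 we are about to see in the query plan:
SELECT relname
     , (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint AS estimated_rows
FROM pg_class
WHERE oid = 'pg_temp.s'::regclass;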
Query plan:
EXPLAIN
SELECT *
FROM r, s
WHERE (r.start_time < s.end_time) AND (s.start_time < r.end_time);
'Nested Loop (cost=0.00..1053004.31 rows=6683889 width=40)'
' Join Filter: ((r.start_time < s.end_time) AND (s.start_time < r.end_time))'
' -> Seq Scan on s (cost=0.00..197.31 rows=12031 width=20)'
' -> Materialize (cost=0.00..107.00 rows=5000 width=20)'
' -> Seq Scan on r (cost=0.00..82.00 rows=5000 width=20)'
'JIT:'
' Functions: 6'
' Options: Inlining true, Optimization true, Expressions true, Deforming true'
Postgres estimates rows=12031 for table s. A pretty good estimate, the algorithm worked.
The estimate is more easily thrown off by deleting rows, as the physical size of the table doesn't shrink automatically. It's a good idea to run VACUUM ANALYZE after a major DELETE, or even VACUUM FULL ANALYZE.
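A sketch, using the demo table s from above (only as an illustration; running it changes the row counts used in the plans below):
DELETE FROM s WHERE id <= 2000;  -- some major delete
VACUUM ANALYZE s;                -- refresh reltuples / relpages and column statistics
-- VACUUM FULL ANALYZE s;        -- also rewrites the table to return disk space (exclusive lock)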
Postgres expects rows=6683889, which matches our expectation (as per Laurenz' explanation):
SELECT 5000 * 12031 * 0.3333333333333333^2 -- 6683888.89
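Where does the 0.3333333333333333 come from? Each of the two < predicates uses the join selectivity function of its operator, scalarltjoinsel, which simply returns the hard-wired default DEFAULT_INEQ_SEL = 0.3333333333333333 (defined in src/include/utils/selfuncs.h). You can look it up the same way as for && below:
SELECT oprjoin -- scalarltjoinsel
FROM pg_operator
WHERE oprname = '<'
AND oprleft = 'timestamptz'::regtype
AND oprright = 'timestamptz'::regtype;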
Your example query is just that: an example. But it happens to be a poor one, as the same can be achieved with range types and operators more efficiently. Specifically with tstzrange and &&:
Selectivity for &&?
SELECT oprjoin -- areajoinsel
FROM pg_operator
WHERE oprname = '&&'
AND oprleft = 'anyrange'::regtype
AND oprright = 'anyrange'::regtype;
The source code in src/backend/utils/adt/geo_selfuncs.c:
Datum
areajoinsel(PG_FUNCTION_ARGS)
{
    PG_RETURN_FLOAT8(0.005);
}
Much more selective: 0.005 << 0.333! And typically more realistic.
EXPLAIN
SELECT *
FROM r, s
WHERE tstzrange(r.start_time, r.end_time) && tstzrange(s.start_time, s.end_time);
Happens to be exactly equivalent, since tstzrange defaults to including the lower bound and excluding the upper bound. I get this query plan:
'Nested Loop (cost=0.00..1203391.81 rows=300775 width=40)'
' Join Filter: (tstzrange(r.start_time, r.end_time) && tstzrange(s.start_time, s.end_time))'
' -> Seq Scan on s (cost=0.00..197.31 rows=12031 width=20)'
' -> Materialize (cost=0.00..107.00 rows=5000 width=20)'
' -> Seq Scan on r (cost=0.00..82.00 rows=5000 width=20)'
'JIT:'
' Functions: 6'
' Options: Inlining true, Optimization true, Expressions true, Deforming true'
Our expectation:
SELECT 5000 * 12031 * 0.005 -- 300775.000
It's a Bingo!
And this query can be supported with an index efficiently, changing the game ...
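For example (a sketch assuming the demo tables above; the index name is made up): an expression GiST index on the range can let the planner run the join as a nested loop with a parameterized index scan on the inner side, instead of scanning s for every row of r. Whether it actually gets used depends on costs and row counts:
CREATE INDEX s_timerange_gist_idx ON s USING gist (tstzrange(start_time, end_time));
EXPLAIN
SELECT *
FROM r, s
WHERE tstzrange(r.start_time, r.end_time) && tstzrange(s.start_time, s.end_time);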