postgresql · statistics · histogram · sql-execution-plan

Row estimation for JOIN queries


How does PostgreSQL estimate the number of rows in a JOIN query like:

EXPLAIN 
SELECT * 
FROM R, S 
WHERE (R.StartTime < S.EndTime) AND (S.StartTime < R.EndTime);

Solution

  • There is a chapter in the manual addressing your question exactly, with an explanation for what Laurenz provided, among other things.

    But that wasn't the full story yet. We also need the row counts (cardinalities) of the underlying tables. Postgres uses estimate_rel_size(), defined in src/backend/utils/adt/plancat.c:

     /*
      * estimate_rel_size - estimate # pages and # tuples in a table or index
      *
      * We also estimate the fraction of the pages that are marked all-visible in
      * the visibility map, for use in estimation of index-only scans.
      *
      * If attr_widths isn't NULL, it points to the zero-index entry of the
      * relation's attr_widths[] cache; we fill this in if we have need to compute
      * the attribute widths for estimation purposes.
      */
     void
     estimate_rel_size(Relation rel, int32 *attr_widths,
                       BlockNumber *pages, double *tuples, double *allvisfrac)
     ...
    

    Here is a minimal SQL query to reproduce the calculation (ignoring some corner cases):

    SELECT (reltuples / relpages * (pg_relation_size(oid) / 8192))::bigint
    FROM   pg_class
    WHERE  oid = 'mytable'::regclass;  -- your table here
    
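    The extrapolation behind that query can be sketched in a few lines of Python (a simplified model of estimate_rel_size, not the actual C code; the function name is illustrative, the default 8192-byte block size is assumed, and corner cases such as relpages = 0 are ignored):

    ```python
    # Simplified model of the planner's row-count extrapolation:
    # scale the tuple density recorded at the last VACUUM/ANALYZE
    # to the table's *current* physical size.

    BLCKSZ = 8192  # default PostgreSQL block size (compile-time setting)

    def estimate_tuples(reltuples: float, relpages: int, rel_size_bytes: int) -> int:
        """Extrapolate the current tuple count from stale pg_class statistics."""
        density = reltuples / relpages        # tuples per page at last VACUUM
        cur_pages = rel_size_bytes // BLCKSZ  # pages the relation occupies now
        return round(density * cur_pages)

    # If a table grew from 100 to 120 pages since statistics said 10000 tuples,
    # the planner scales the estimate up proportionally:
    print(estimate_tuples(10000, 100, 120 * 8192))  # 12000
    ```

    This is why the estimate below tracks inserts that happened after VACUUM: the file grows, and the density-based extrapolation picks that up.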

    More details:

    Example

    CREATE TEMP TABLE r(id serial, start_time timestamptz, end_time timestamptz);
    CREATE TEMP TABLE s(id serial, start_time timestamptz, end_time timestamptz);
    
    INSERT INTO r(start_time, end_time)
    SELECT now(), now()  -- actual values don't matter for this particular case
    FROM generate_series (1, 5000);
    
    INSERT INTO s(start_time, end_time)
    SELECT now(), now()
    FROM generate_series (1, 10000);
    
    VACUUM r, s;  -- set reltuples & relpages in pg_class
    
    -- add 2000 rows to S
    INSERT INTO s(start_time, end_time)
    SELECT now(), now()
    FROM generate_series (1, 2000);
    

    pg_class still says reltuples = 5000 and 10000, but we know there are actually 5000 and 12000 rows in r and s. (Since these are temporary tables, they are not covered by autovacuum, so the numbers are never updated automatically.) Check:

    SELECT relname, reltuples, relpages  -- 5000 | 10000
    FROM   pg_class c
    WHERE  c.oid IN ('pg_temp.r'::regclass, 'pg_temp.s'::regclass);
    
    SELECT count(*) FROM r; -- 5000
    SELECT count(*) FROM s; -- 12000
    

    Query plan:

    EXPLAIN
    SELECT *
    FROM r, s
    WHERE (r.start_time < s.end_time) AND (s.start_time < r.end_time);
    
    'Nested Loop  (cost=0.00..1053004.31 rows=6683889 width=40)'
    '  Join Filter: ((r.start_time < s.end_time) AND (s.start_time < r.end_time))'
    '  ->  Seq Scan on s  (cost=0.00..197.31 rows=12031 width=20)'
    '  ->  Materialize  (cost=0.00..107.00 rows=5000 width=20)'
    '        ->  Seq Scan on r  (cost=0.00..82.00 rows=5000 width=20)'
    'JIT:'
    '  Functions: 6'
    '  Options: Inlining true, Optimization true, Expressions true, Deforming true'
    

    Postgres estimates rows=12031 for table s. A pretty good estimate: the algorithm worked.
    The estimate is more easily thrown off by deleting rows, as the physical size of the table doesn't shrink automatically. It's a good idea to run VACUUM ANALYZE after a major DELETE, or even VACUUM FULL ANALYZE. See:

    Postgres expects rows=6683889, which matches our expectation (as per Laurenz' explanation): each of the two inequality predicates gets the default selectivity for inequalities, 0.3333333333333333 (DEFAULT_INEQ_SEL in src/include/utils/selfuncs.h):

    SELECT 5000 * 12031 * 0.3333333333333333^2  -- 6683888.89
    
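    To double-check the arithmetic outside SQL (a quick sketch; the variable names are illustrative):

    ```python
    # Two inequality join predicates, each with the default selectivity
    # DEFAULT_INEQ_SEL = 1/3, multiplied together (independence assumed).
    DEFAULT_INEQ_SEL = 0.3333333333333333

    rows_r, rows_s = 5000, 12031   # planner's input cardinalities from above
    estimate = rows_r * rows_s * DEFAULT_INEQ_SEL ** 2
    print(round(estimate))  # 6683889 - matches the rows= figure in the plan
    ```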

    Better query

    Your example query is just that: an example. But it happens to be a poor one, as the same can be achieved more efficiently with range types and operators. Specifically with tstzrange and the overlap operator &&:

    Selectivity for &&?

    SELECT oprjoin  -- areajoinsel
    FROM pg_operator
    WHERE oprname = '&&'
    AND oprleft = 'anyrange'::regtype
    AND oprright = 'anyrange'::regtype;
    

    The source code in src/backend/utils/adt/geoselfuncs.c:

     Datum
     areajoinsel(PG_FUNCTION_ARGS)
     {
         PG_RETURN_FLOAT8(0.005);
     }
    

    Much more selective: 0.005 << 0.333! And typically more realistic.

    EXPLAIN
    SELECT *
    FROM r, s
    WHERE tstzrange(r.start_time, r.end_time) && tstzrange(s.start_time, s.end_time);
    

    This happens to be exactly equivalent, since tstzrange defaults to including the lower bound and excluding the upper bound. I get this query plan:

    'Nested Loop  (cost=0.00..1203391.81 rows=300775 width=40)'
    '  Join Filter: (tstzrange(r.start_time, r.end_time) && tstzrange(s.start_time, s.end_time))'
    '  ->  Seq Scan on s  (cost=0.00..197.31 rows=12031 width=20)'
    '  ->  Materialize  (cost=0.00..107.00 rows=5000 width=20)'
    '        ->  Seq Scan on r  (cost=0.00..82.00 rows=5000 width=20)'
    'JIT:'
    '  Functions: 6'
    '  Options: Inlining true, Optimization true, Expressions true, Deforming true'
    

    Our expectation:

    SELECT 5000 * 12031 * 0.005  -- 300775.000
    
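    The same check in a couple of lines (variable names are illustrative):

    ```python
    # The range path applies the constant join selectivity 0.005
    # (returned by areajoinsel) once, for the single && predicate.
    JOIN_SEL = 0.005

    rows_r, rows_s = 5000, 12031
    print(round(rows_r * rows_s * JOIN_SEL))  # 300775 - matches the plan
    ```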

    It's a Bingo!
    And this query can be supported with an index efficiently, changing the game ...