Search code examples
postgresqlbooleansql-execution-planb-tree-index

Boolean column in index and filter sections of explain in postgres


I have a table with boolean column - "is_woman" bool DEFAULT true

I have an btree index with this column (and some others like age, town and so on) - is_woman ASC NULLS LAST

I have a query with this column - is_woman IS FALSE

As a result I get an explain:

-> Index Scan using woman_idx on superjob (cost=... rows=... width=32) (actual time=... rows=... loops=1)
 Index Cond: (town = 1) AND (is_woman = false) AND (age >= 35) AND (age <= 60))
 Filter: (is_woman IS FALSE)

Why there are two is_woman conditions? One in index section, second - in filter?

UPDATED

With @dmitry's help I've created two partial indexes: one for man with is_woman is false, second for woman with is_woman is true.

Explain for the same query:

Bitmap Index Scan on is_woman_woman_idx  (...) (actual time=469.446..469.446 rows=406867 loops=1)
    Index Cond: ((age >= 1) AND (town = 1))
Execution time: 1827.239 ms

No Filter section and this query works faster:

  • actual time 2.227..2754.378469.446..469.446
  • execution time 2792.804 ms1827.239 ms

Solution

  • UPDATED

    I cannot see anything wrong with this EXPLAIN except for the fact the you are indexing boolean column (obviously, column has low cardinality field). It might be beneficial to use Partial Index with definition something like:

    CREATE INDEX ON yourtable WHERE is_woman = FALSE;
    

    As for the question itself, you have a query with WHERE ... condition. Postgres planner/optimizer decided to use woman_idx index scan instead of sequential one - Index Cond is indicated for index scan.

    If you can see Filter statement it means that the plan node checks the condition for each row it scans (in our case each woman_idx scan), and outputs only the ones that pass the condition. For details check EXPLAIN documentation.