Modify Postgres 9.0 query plan


I'm looking for more information on how to modify a Postgres 9.0 query plan.

I have this query:

SELECT
    max(creation_date) 
FROM 
    statistics_loged_users 
WHERE
    school_id    = 338 and 
    group_id     = 3 and 
    usr_id       = 243431;

And the EXPLAIN ANALYZE output:

"Aggregate  (cost=1518.56..1518.57 rows=1 width=8) (actual time=410.459..410.459 rows=1 loops=1)"
"  ->  Bitmap Heap Scan on statistics_loged_users  (cost=993.96..1518.55 rows=1 width=8) (actual time=410.025..410.406 rows=210 loops=1)"
"        Recheck Cond: ((group_id = 3) AND (usr_id = 243431))"
"        Filter: (school_id = 338)"
"        ->  BitmapAnd  (cost=993.96..993.96 rows=133 width=0) (actual time=409.521..409.521 rows=0 loops=1)"
"              ->  Bitmap Index Scan on statistics_loged_users_idx2  (cost=0.00..496.85 rows=26669 width=0) (actual time=375.770..375.770 rows=3050697 loops=1)"
"                    Index Cond: (group_id = 3)"
"              ->  Bitmap Index Scan on statistics_loged_users_idx  (cost=0.00..496.85 rows=26669 width=0) (actual time=0.077..0.077 rows=210 loops=1)"
"                    Index Cond: (usr_id = 243431)"
"Total runtime: 411.419 ms"

We can see that the first filter is on group_id. This table is very, very big :) so there are a lot of rows with the same group_id, but far fewer rows with the same usr_id.

The question is: how can I tell the query planner to filter on usr_id first?

I created an index on group_id and usr_id and that fixed the performance, but I still want to know how to modify the query plan, for future reference :)


Solution

  • The PostgreSQL planner doesn't really accept hints in the way that you want. The easiest way to achieve what you want is to rewrite your query.
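
    There are no per-query hints, but for experimenting you can discourage whole plan node types with the planner's enable_* settings. A minimal sketch, assuming you only want to see which plan comes out when bitmap scans are penalised (setting it to off adds a large cost penalty rather than forbidding the node type, and it is meant as a diagnostic, not as a permanent setting):

    ```sql
    -- Experiment inside a transaction so the setting cannot leak out.
    BEGIN;

    -- SET LOCAL only lasts until the end of this transaction.
    SET LOCAL enable_bitmapscan = off;

    -- Same query as in the question; compare this plan with the original one.
    EXPLAIN ANALYZE
    SELECT max(creation_date)
    FROM statistics_loged_users
    WHERE school_id = 338
      AND group_id  = 3
      AND usr_id    = 243431;

    ROLLBACK;
    ```

    If the resulting plan uses only the usr_id index and is much faster, that at least confirms where the problem is.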

    Analysing your EXPLAIN ANALYZE output, it's clear that most of the time is spent in the following section:

    " -> Bitmap Index Scan on statistics_loged_users_idx2 (cost=0.00..496.85 rows=26669 width=0) (actual time=375.770..375.770 rows=3050697 loops=1)"

    " Index Cond: (group_id = 3)"

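    Note also the gap between estimate and reality on that line: the planner expected 26669 rows for group_id = 3 and got 3050697. Both index scans even carry the same estimate, which could mean that the table statistics are missing or stale. If that is the case, refreshing them may let the planner prefer the usr_id index by itself. A sketch, where the statistics target of 1000 is only an illustrative value and not a recommendation:

    ```sql
    -- Refresh the planner statistics for the table.
    ANALYZE statistics_loged_users;

    -- Optionally collect a larger sample for the skewed column
    -- (1000 is an assumed example value), then analyze again.
    ALTER TABLE statistics_loged_users
        ALTER COLUMN group_id SET STATISTICS 1000;
    ANALYZE statistics_loged_users;

    -- Re-check the row estimates against the actual row counts.
    EXPLAIN ANALYZE
    SELECT max(creation_date)
    FROM statistics_loged_users
    WHERE school_id = 338
      AND group_id  = 3
      AND usr_id    = 243431;
    ```
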
    If you rewrite your query so that it first filters only on usr_id and school_id, and applies the group_id condition afterwards, you should get what you want:

    SELECT
        max(creation_date) 
    FROM 
    (
        SELECT 
            group_id, creation_date
        FROM
            statistics_loged_users 
        WHERE
            school_id    = 338 and 
            usr_id       = 243431
    ) AS cd
    WHERE 
        group_id = 3;
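
    One caveat: the planner is free to flatten a simple subquery like this back into the outer query, in which case you may end up with the original plan again. A commonly used workaround is to add OFFSET 0 to the subquery, which stops the planner from pulling it up or pushing the group_id condition down into it. A sketch of that variant (check with EXPLAIN ANALYZE that it actually changes the plan on your data):

    ```sql
    SELECT
        max(creation_date)
    FROM
    (
        SELECT
            group_id, creation_date
        FROM
            statistics_loged_users
        WHERE
            school_id    = 338 and
            usr_id       = 243431
        OFFSET 0  -- acts as an optimization fence; returns the same rows
    ) AS cd
    WHERE
        group_id = 3;
    ```

    The result is the same as for the original query; only the order in which the conditions are applied is pinned down.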