Search code examples
postgresqlquery-optimizationrow-level-security

Postgres with row level security does not optimize queries / uses one-time filters


It seems that postgres with enabled row level security on a table cannot properly optimize queries. The following example sql demonstrates the issue:

begin transaction;
    -- create demo table
    create table entries (id int primary key);

    -- setup rls
    alter table entries enable row level security;
    create policy entries_policy on entries for all using (
        current_setting('app.can_read_entries')::boolean = true
    );

    -- setup a role which cannot bypass rls
    create role authenticated;
    grant usage on schema public to authenticated;
    grant select on all tables in schema public to authenticated;

    -- configure setting (just as an example for a query condition)
    set local app.can_read_entries = true;
rollback;

When not using RLS:

explain
select * from entries
where (
    -- Simulate RLS query as superuser which bypasses RLS by adding the RLS condition manually
    current_setting('app.can_read_entries')::boolean = true
);

-- Result  (cost=0.01..35.51 rows=2550 width=4)
--   One-Time Filter: (current_setting('app.can_read_entries'::text))::boolean
--   ->  Seq Scan on entries  (cost=0.01..35.51 rows=2550 width=4)

whereas with row level security it will be

-- switch to 'authenticated' role to use rls in queries
set local role 'authenticated';

-- run the query with rls inlining its conditions
explain
select * from entries;

-- Seq Scan on entries  (cost=0.00..54.63 rows=1275 width=4)
--   Filter: (current_setting('app.can_read_entries'::text))::boolean

In the first query without rls, the optimizer properly recognizes, that the condition is static for the query and extracts it as a One-Time Filter check. But the second query, where the condition is provided by rls does not get fully optimized and postgres will evaluate the condition for every row, even though it is constant for the whole query.

In this example, there's not much data and the query is simple, however it demonstrates the issue quite easily and the same behavior can be observed in tables with lots of data and indices etc. A more elaborate example is described here: https://pastebin.com/iQu6L5Sj

I already read about the leakproof attribute on functions and tried setting the leakproof attribute on current_setting and even went and tried update pg_proc set proleakproof = true;.

However, no matter what I try, postgres will not properly optimize the query with row level security.


EDIT: Another example which does not use a function like current_setting:

-- additionally create a users table
create table users (id int primary key);
-- and then use this policy
create policy entries_policy on entries for all using (
    exists (select 1 from users where id = 1)
    -- current_setting('app.can_read_entries')::boolean = true
);

Without RLS it will run as

Result  (cost=8.17..43.67 rows=2550 width=4)
  One-Time Filter: $0
  InitPlan 1 (returns $0)
    ->  Index Only Scan using users_pkey on users  (cost=0.15..8.17 rows=1 width=0)
          Index Cond: (id = 1)
  ->  Seq Scan on entries  (cost=8.17..43.67 rows=2550 width=4)

and with RLS it will directly perform the sequence scan instead of using early-return:

Seq Scan on entries  (cost=8.17..43.67 rows=1275 width=4)
  Filter: $0
  InitPlan 1 (returns $0)
    ->  Index Only Scan using users_pkey on users  (cost=0.15..8.17 rows=1 width=0)
          Index Cond: (id = 1)

Solution

  • Yes, queries using row level security won't be optimized as well as queries without. PostgreSQL will only perform optimizations that it knows are safe. Because the optimizer's reasoning capabilities are limited, that will sometimes mean that it chooses a plan that won't be perfect.

    In your special case, the queries will perform almost the same in the case where the user is allowed to see the data. But there is a difference in the "forbidden" case: the RLS query will still scan the table. Your problem is that the cast from text to boolean, which is performed using the type input function boolin(), is not leakproof.