Search code examples
sqlpostgresqlrow-level-security

Row Level Security in Postgres on Normalized Tables


The premise

In documentation, Row Level Security seems great. Based on what I've read I can now stop creating views like this:

SELECT data.*
FROM data
JOIN user_data
ON data.id = user_data.data_id
AND user_data.role = CURRENT_ROLE

The great part is, Postgres has a great analysis for that view starting with an index scan then a hash join on the user_data table, exactly what we want to happen because it's crazy fast. Compare that with a my RLS implementation:

CREATE POLICY data_owner
ON data
FOR ALL
TO user
USING (
  (
    SELECT TRUE AS BOOL FROM (
      SELECT data_id FROM user_data WHERE user_role = CURRENT_USER
    ) AS user_data WHERE user_data.data_id = data.id
  ) = true
)
WITH CHECK (TRUE);

This bummer of a policy executes the condition for each row in the data table, instead of optimizing by scoping the query to the rows which our CURRENT_USER has access to, like our view does. To be clear, that means select * from data hits every row in the data table.

The question

How do I write a policy with an inner select which doesn't test said select on every row in the target table. Said another way: how do I get RLS to run my policy on the target table before running the actual query on the result?

p.s. I've left this question someone vague and fiddle-less, mostly because sqlfiddle hasn't hit 9.5 yet. Let me know if I need to add more color or some gists to get my question across.


Solution

  • PostgreSQL may be able to generate a better plan if you phrase the policy like this:

    ...
    USING (EXISTS
              (SELECT data_id
               FROM user_data
               WHERE user_data.data_id = data.id
                 AND role = current_user
              )
          )
    

    You should have a (PRIMARY KEY?) index ON user_data (role, data_id) to speed up nested loop joins.

    But I think that it would be a better design to include the permission information in the data table itself, perhaps using the name[] type:

    CREATE TABLE data(
       id integer PRIMARY KEY,
       val text,
       acl name[] NOT NULL
    );
    
    INSERT INTO data VALUES (1, 'one',   ARRAY[name 'laurenz', name 'advpg']);
    INSERT INTO data VALUES (2, 'two',   ARRAY[name 'advpg']);
    INSERT INTO data VALUES (3, 'three', ARRAY[name 'laurenz']);
    

    Then you can use a policy like this:

    CREATE POLICY data_owner ON data FOR ALL TO PUBLIC
       USING (acl @> ARRAY[current_user::name])
       WITH CHECK (TRUE);
    ALTER TABLE data ENABLE ROW LEVEL SECURITY;
    ALTER TABLE data FORCE ROW LEVEL SECURITY;
    

    When I SELECT, I get only the rows for which I have permission:

    SELECT id, val FROM data;
     id |  val
    ----+-------
      1 | one
      3 | three
    (2 rows)
    

    You can define a GIN index to support that condition:

    CREATE INDEX ON data USING gin (acl _name_ops);