We have a big table (events
) which contains events for a lot of devices. Each row is protected by an RLS check that verifies that the authenticated user is in an organization that owns the specific device.
We now implemented a get_events
function that returns the last _limit
events for an array of given devices (_device_ids
). Unfortunately, that function is really slow, hitting the default timeout of 8s.
-- return the _limit most recent events for each given entry in the `_device_ids` array.
RETURN QUERY
SELECT e.device_id, e.type, e.timestamp, e.data
FROM unnest(_device_ids) AS p(device_id)
CROSS JOIN LATERAL (
SELECT e.*
FROM private_schema.events e
WHERE e.device_id = p.device_id
AND e.timestamp >= _since
ORDER BY e.timestamp DESC
LIMIT _limit
) AS e
ORDER BY e.device_id, e.timestamp DESC;
We did some experiments, and found that running this function with SECURITY DEFINER
dramatically improves its performance, bringing it back to ~40ms.
At the moment I therefore implemented a work-around where a SECURITY INVOKER
function makes sure that the user has the right access, before calling the get_events
function which is now marked as SECURITY DEFINER
.
Ideally, I would prefer keeping the get_events
function as SECURITY INVOKER
(just so I can't make any mistakes). Is there a way to make the function faster?
Edit: As suggested by Florian Klein I added 'LEAKPROOF' to my functions, but that didn't change much.
Here is an attempt to get the profile of running the query that is run in the get_events
function:
ALTER FUNCTION is_auth_in_org_of_alias LEAKPROOF;
ALTER FUNCTION is_auth_member_of_org LEAKPROOF;
ALTER FUNCTION private_schema.get_events LEAKPROOF;
SET auto_explain.log_nested_statements = true;
SET auto_explain.log_min_duration=0;
EXPLAIN ANALYZE SELECT e.device_id, e.type, e.timestamp, e.data
FROM unnest(ARRAY['5a52f07f-c2fb-54b9-b611-21e8fcec678b'::uuid]) AS p(device_id)
CROSS JOIN LATERAL (
SELECT e.*
FROM private_schema.events e
WHERE e.device_id = p.device_id
ORDER BY e.timestamp DESC
LIMIT 20
) AS e
ORDER BY e.device_id, e.timestamp DESC;
with output:
Sort (cost=3841.50..3841.55 rows=20 width=55) (actual time=7475.117..7475.119 rows=20 loops=1)
Sort Key: e.device_id, e."timestamp" DESC
Sort Method: quicksort Memory: 26kB
-> Nested Loop (cost=3840.61..3841.07 rows=20 width=55) (actual time=7474.995..7475.000 rows=20 loops=1)
-> Function Scan on unnest p (cost=0.00..0.01 rows=1 width=16) (actual time=0.022..0.023 rows=1 loops=1)
-> Limit (cost=3840.60..3840.65 rows=20 width=59) (actual time=7474.968..7474.971 rows=20 loops=1)
-> Sort (cost=3840.60..3845.55 rows=1978 width=59) (actual time=7474.967..7474.968 rows=20 loops=1)
Sort Key: e."timestamp" DESC
Sort Method: top-N heapsort Memory: 47kB
-> Bitmap Heap Scan on events e (cost=89.29..3787.97 rows=1978 width=59) (actual time=9.051..7438.728 rows=44280 loops=1)
Recheck Cond: (device_id = p.device_id)
Filter: is_auth_in_org_of_alias(device_id)
Heap Blocks: exact=2036
-> Bitmap Index Scan on events_device_id (cost=0.00..88.80 rows=5934 width=0) (actual time=5.329..5.329 rows=44280 loops=1)
Index Cond: (device_id = p.device_id)
Planning Time: 1.101 ms
Execution Time: 7475.282 ms
Since this might be related, here is the RLS policy:
CREATE FUNCTION is_auth_member_of_org(_organization_id uuid)
RETURNS boolean
LANGUAGE sql
SECURITY DEFINER
AS $function$
SELECT EXISTS (
SELECT 1
FROM public.roles
WHERE user_id = auth.uid()
AND organization_id = _organization_id
)
$function$;
CREATE OR REPLACE FUNCTION is_auth_in_org_of_alias(_device_id UUID)
RETURNS BOOLEAN
SECURITY DEFINER
LANGUAGE plpgsql
AS $$
BEGIN
RETURN is_auth_member_of_org(
(SELECT organization_id FROM public.devices WHERE alias = _device_id)
);
END;
$$;
CREATE POLICY "Authenticated have full access to events table of devices in the orgs they are members in"
ON toit_artemis.events
FOR ALL
TO authenticated
USING (public.is_auth_in_org_of_alias(device_id))
WITH CHECK (public.is_auth_in_org_of_alias(device_id));
Looks like you could simplify your policy by inlining the sql directly in the RLS:
CREATE POLICY "Authenticated have full access to events table of devices in the orgs they are members in"
ON toit_artemis.events
FOR ALL
TO authenticated
USING (exists(
select from public.roles r
join public.device d using (organization_id)
where r.user_id = auth.uid()
and d.alias = toit_artemis.events.device_id
-- maybe more is needed, IDK
))
didn't test it as I don't have access to the full schema, but I'd be interested in seeing the plan of this one, where you shouldn't see
Filter: is_auth_in_org_of_alias(device_id)
in the plan anymore, which might be the reason why it's slow, as it might make 44000 calls to your UDF, but I'm not sure exactly (see https://explain.dalibo.com/plan/8c8d3b99438egccg#plan/node/6)