My previous question discovered that the cause of my performance troubles is because using the indexed distance operator in Postgis is very slow on large areas that return many rows.
As a result, it seems the ideal solution may be to pick some arbitrary miles distance amount and then use the indexed distance operator (ST_DWithin
) when it's in that range (let's say, below 75 miles), and then to use the non-indexed distance operator (<->
) when it's above that range (so, 75 miles or above)
My current function looks like this:
CREATE OR REPLACE FUNCTION public.usp_get_data(i_distance_choice integer, i_longitude double precision, i_latitude double precision)
RETURNS TABLE(convo_id bigint)
LANGUAGE SQL STABLE
AS $function$
SELECT po.convo_id
FROM post po
WHERE ST_DWithin(po.geog, ST_SetSRID(ST_MakePoint(i_longitude, i_latitude), 4326)::geography, i_distance_choice * 1609.34)
ORDER BY po.reply_count DESC, convo_id DESC
LIMIT 10;
$function$;
So it looks like I need to have some sort of conditional. That is, if i_distance_choice
is below 75
, to use ST_DWithin
, and then if it is 75
or above, then to use the <->
operator instead.
Is there a way to do this with the SQL
lanaguage, or would I have to use something like plsql instead? Or is there an even better way where we can inform Postgres to not use the index for sufficiently large inputs? So that the conditional is not even required?
I don't think it can be done in pure SQL.
It is pretty straightforward to translate this to PL/pgSQL.
CREATE OR REPLACE FUNCTION public.usp_get_data(i_distance_choice integer, i_longitude double precision, i_latitude double precision)
RETURNS TABLE(convo_id bigint)
LANGUAGE plpgsql
STABLE
AS $function$
BEGIN
IF i_distance_choice < 75 then
return query SELECT po.convo_id
FROM post po
WHERE ST_DWithin(po.geog, ST_SetSRID(ST_MakePoint(i_longitude, i_latitude), 4326), i_distance_choice * 1609.34)
ORDER BY po.reply_count DESC, convo_id DESC
LIMIT 10;
ELSE
return query SELECT po.convo_id
FROM post po
WHERE po.geog<->ST_SetSRID(ST_MakePoint(i_longitude, i_latitude), 4326) < i_distance_choice * 1609.34
ORDER BY po.reply_count DESC, convo_id DESC
LIMIT 10;
END IF;
END
$function$
I verified it uses the geography index <75 and the btree (reply_count, convo_id)
index at 75 and above.