I have a table with an id, time, and PostGIS point object column. The table contains multiple points for each ID at different times. I want to find the average distance of one ID's set of points to another ID's set of points, for each point that happens at the same time, and find that average for all other ID's set of points compared to that original ID's set.
So far I have this function:
CREATE TYPE score AS (id int, dist float);
CREATE OR REPLACE FUNCTION avgdist(id1 int) RETURNS TABLE (id int, dist float) LANGUAGE plpgsql AS
$func$
DECLARE
scores score;
id2 int;
set2 record;
begin
id2:= 0;
IF (id1 = id2 ) THEN
id2:= 1;
END IF;
FOR set2 IN
SELECT my_table.id, my_table.time, my_table.geom FROM my_table WHERE my_table.id = id2 loop
id2:= id2 + 1;
CONTINUE WHEN id1 = id2;
EXECUTE 'WITH origin AS (SELECT time, id, geom FROM my_table WHERE id = $1)
SELECT id, avg(ST_Distance(origin.geom, $2))
FROM origin WHERE origin.time = $3
group by origin.id
ORDER BY id'
INTO scores
USING id1, set2.geom, set2.time;
end loop;
RETURN;
end
$func$;
When calling this function with select * from avgdist(2)
I do not get any results, same with slect avgdist(2)
. I do get results when I run the query within the execute on it's own in psql with plugged in values.
I am new to making functions in sql, so i don't really understand how to put the scores in the return table when that table doesn't have a name. And I can't seem to use RETURN QUERY
since I need to return the results for all the queries in the loop.
Help on this would be appreciated, or is there a better way to achieve my desired results without looping?
There are more issues:
Clause RETURNS TABLE
defines OUT
variables. You can use these variables.
CREATE OR REPLACE FUNCTION foo()
RETURNS TABLE (r1 int, r2 int)
-- you don't need aux variables for result
...
r1 := 10; r2 := 10;
RETURN NEXT;
EXECUTE INTO
can store only first row (or value) from of dynamic query result.
RETURN
stops function evaluation. You should to use RETURN NEXT
or RETURN QUERY
.
FOR x, y IN SELECT ..
LOOP
-- when only first row of result is interesting
EXECUTE '..' INTO r1, r2 USING x, y;
RETURN NEXT;
END LOOP
or with RETURN QUERY EXECUTE
FOR x, y IN SELECT ..
LOOP
RETURN QUERY EXECUTE '..' USING x, y;
END LOOP
I don't understand to your code, but it looks so you don't need to use dynamic SQL - EXECUTE
commands. The dynamic SQL is necessary only when you have a variable on place of SQL identifiers. There is not this case. And same case - why you use CTE WITH
clause. It is not necessary - and it can has negative impacts (unwanted materialization - fixed by PostgreSQL 12). Use just RETURN QUERY
(without EXECUTE
) and pass query as query (not as a string).
The documentation about these commands is good - https://www.postgresql.org/docs/current/plpgsql-control-structures.html#PLPGSQL-STATEMENTS-RETURNING