I'm writing a PL/PgSQL function to take an ltree
and break out subtrees of various lengths. I'm going to run this against a sample of some real data to get frequency counts on subtrees, regardless of their position in the path/tree.
I'm getting hung up on returning the results. I can generate what I want okay, I just can't figure out how to get them into a returned result. I don't mind using CREATE TYPE
, although RETURNS TABLE
seems like it should be enough. I'd rather not use RETURN SETOF record
, if possible.
I've tried lots of permutations, but without luck. Can someone spot my error? Here's the function code:
DROP FUNCTION IF EXISTS tools.ltree_subtrees (ltree, int4);
CREATE OR REPLACE FUNCTION tools.ltree_subtrees (
source_tree ltree,
shortest_subtree int4 DEFAULT 3) -- 1 & 2 explode out the result set size.
RETURNS TABLE (
substree_start int4,
subtree text)
LANGUAGE plpgsql
AS
$BODY$
DECLARE
-- Unwind the packed.ltree.values into a text[array].
-- Efficient? Doubt it. Plenty of options for production.
label_array text[] := string_to_array(ltree2text(source_tree), '.');
source_tree_length int4 := nlevel(source_tree);
subtree_size int4 := shortest_subtree;
out_row record;
BEGIN
-- Loop over the whole label array, sliding from position 1 to the end
FOR subtree_size IN shortest_subtree..source_tree_length LOOP
RAISE NOTICE 'subtree_size(%) ', subtree_size;
FOR label_index IN 1..(source_tree_length - subtree_size + 1) LOOP
SELECT label_index,
label_array [label_index : label_index + subtree_size - 1],
label_index,
subtree_size
INTO out_row;
RETURN NEXT;
RAISE NOTICE 'out_row = (%)', out_row;
END LOOP;
END LOOP;
END
$BODY$;
Here's a sample call with output:
select * from ltree_subtrees(text2ltree('My.Best.Dog.Is.Named.Lilli'));
/*
+----------------+---------+
| substree_start | subtree |
+----------------+---------+
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
| NULL | NULL |
+----------------+---------+
*/
That looks like the right result row count, but the contents are all NULL
. The RAISE NOTICE
reports shows I'm generating the outputs I want....I'm just not outputing them.
Initializing connection...
Connection ready. Took 0.054 seconds.
NOTICE: subtree_size(3)
NOTICE: out_row = ((1,"{My,Best,Dog}",1,3))
NOTICE: out_row = ((2,"{Best,Dog,Is}",2,3))
NOTICE: out_row = ((3,"{Dog,Is,Named}",3,3))
NOTICE: out_row = ((4,"{Is,Named,Lilli}",4,3))
NOTICE: subtree_size(4)
NOTICE: out_row = ((1,"{My,Best,Dog,Is}",1,4))
NOTICE: out_row = ((2,"{Best,Dog,Is,Named}",2,4))
NOTICE: out_row = ((3,"{Dog,Is,Named,Lilli}",3,4))
NOTICE: subtree_size(5)
NOTICE: out_row = ((1,"{My,Best,Dog,Is,Named}",1,5))
NOTICE: out_row = ((2,"{Best,Dog,Is,Named,Lilli}",2,5))
NOTICE: subtree_size(6)
NOTICE: out_row = ((1,"{My,Best,Dog,Is,Named,Lilli}",1,6))
SELECT 10
Statement completed successfully. (10 results). 0.009 seconds. (Line 1).
Finished. Total time: 0.064 seconds.
I'm hopeful that I'm missing something simple 🤞. I've tried RETURN SETOF record
, RETURN NEXT
, RETURN NEXT row_out
(parameter not allowed error.)
Thanks for any help.
Help received! I'm posting a corrected version of the function, only to show the syntax involved in looping, building synthetic rows, and returning them. The function itself if a bit...dubious. Still, good enough for some experiments.
DROP FUNCTION IF EXISTS ltree_subtrees (ltree, int4);
CREATE OR REPLACE FUNCTION ltree_subtrees (
source_tree ltree,
shortest_subtree int4 DEFAULT 3) -- 1 & 2 explode out the result set size.
RETURNS TABLE (
subtree_start int4,
subtree ltree)
LANGUAGE plpgsql
AS
$BODY$
DECLARE
label_array text[] := string_to_array(ltree2text(source_tree), '.');
source_tree_length int4 := nlevel(source_tree);
subtree_size int4 := shortest_subtree;
BEGIN
FOR subtree_size IN shortest_subtree..source_tree_length LOOP
FOR label_index IN 1..(source_tree_length - subtree_size + 1) LOOP
-- Do *not* build a record, assign the variables declared in the RETURNS TABLE
-- clause and then RETURN NEXT to append that to the output.
subtree_start := label_index;
subtree := text2ltree(array_to_string(label_array [label_index : label_index + subtree_size - 1], '.'));
RETURN NEXT;
END LOOP;
END LOOP;
END
$BODY$;
-- Try it:
select subtree_start,
subtree,
nlevel(subtree) as subtree_length
from ltree_subtrees(text2ltree('My.Best.Dog.Is.Named.Lilli'))
/*
+---------------+----------------------------+----------------+
| subtree_start | subtree | subtree_length |
+---------------+----------------------------+----------------+
| 1 | My.Best.Dog | 3 |
| 2 | Best.Dog.Is | 3 |
| 3 | Dog.Is.Named | 3 |
| 4 | Is.Named.Lilli | 3 |
| 1 | My.Best.Dog.Is | 4 |
| 2 | Best.Dog.Is.Named | 4 |
| 3 | Dog.Is.Named.Lilli | 4 |
| 1 | My.Best.Dog.Is.Named | 5 |
| 2 | Best.Dog.Is.Named.Lilli | 5 |
| 1 | My.Best.Dog.Is.Named.Lilli | 6 |
+---------------+----------------------------+----------------+
*/
You have to assign values to the output variables subtree
and subtree_start
before calling RETURN NEXT
.