I am using PostgreSQL and have a table with a path column that is of type ltree
.
The problem I am trying to solve is: given the whole tree structure, what parent has the most children excluding the root.
Sample data looks like this:
path column = ; has a depth of 0 and has 11 children its id is 1824 # dont want this one because its the root
path column = ; has a depth of 0 and has 1 children its id is 1823
path column = 1823; has a depth of 1 and has 1 children its id is 1825
path column = 1823.1825; has a depth of 2 and has 1 children its id is 1826
path column = 1823.1825.1826; has a depth of 3 and has 1 children its id is 1827
path column = 1823.1825.1826.1827; has a depth of 4 and has 1 children its id is 1828
path column = 1824.1925.1955.1959.1972.1991; has a depth of 6 and has 5 children its id is 2001
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2141
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 0 children its id is 2040
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2054
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 0 children its id is 2253
path column = 1824.1925.1955.1959.1972.1991.2001; has a depth of 7 and has 1 children its id is 2166
path column = 1824.1925.1955.1959.1972.1991.2001.2054; has a depth of 8 and has 0 children its id is 2205
path column = 1824.1925.1955.1959.1972.1991.2001.2141; has a depth of 8 and has 0 children its id is 2161
path column = 1824.1925.1955.1959.1972.1991.2001.2166; has a depth of 8 and has 1 children its id is 2389
path column = 1824.1925.1955.1959.1972.1991.2001.2166.2389; has a depth of 9 and has 0 children its id is 2402
path column = 1824.1925.1983; has a depth of 3 and has 1 children its id is 2135
path column = 1824.1925.1983.2135; has a depth of 4 and has 0 children its id is 2239
path column = 1824.1926; has a depth of 2 and has 5 children its id is 1942
path column = 1824.1926; has a depth of 2 and has 11 children its id is 1928 # this is the row I am after
path column = 1824.1926; has a depth of 2 and has 2 children its id is 1933
path column = 1824.1926; has a depth of 2 and has 2 children its id is 1989
path column = 1824.1926.1928; has a depth of 3 and has 3 children its id is 2051
path column = 1824.1926.1928; has a depth of 3 and has 0 children its id is 2024
path column = 1824.1926.1928; has a depth of 3 and has 2 children its id is 1988
So, in this example, the row with id 1824 (the root) has 11 children and the row with id 1928 has 11 children with a depth of 2; this is the row I am after.
I am new to ltree and sql for that matter.
(This is a revised question with added sample data after Ltree find parent with most children postgresql was closed).
To find the node with the most children:
SELECT subpath(path, -1, 1), count(*) AS children
FROM tbl
WHERE path <> ''
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1;
... and exclude root nodes:
SELECT *
FROM (
SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
FROM tbl
WHERE path <> ''
GROUP BY 1
) ct
LEFT JOIN (
SELECT tbl_id
FROM tbl
WHERE path = ''
) x USING (tbl_id)
WHERE x.tbl_id IS NULL
ORDER BY children DESC
LIMIT 1
Assuming that root nodes have an empty ltree
(''
) as path. Might be NULL
. Then use path IS NULL
...
The winner in your example is actually 2001
, with 5 children.
Use the function subpath(...)
provided by the the additional module ltree
.
Get the last node in the path with a negative offset, which is the direct parent of the element.
Count how often that parent appears, exclude root nodes and take the remaining one with the highest count.
Use ltree2text()
to extract the value from ltree
.
If multiple nodes have equally the most children an arbitrary one is picked in the example.
This is the work I had to do to get to a useful test case (after trimming some noise):
See SQLfiddle.
In other words: please remember to provide a useful test case next time.
Answer to comment.
First, expand the test case:
ALTER TABLE tbl ADD COLUMN postal_code text
, ADD COLUMN whatever serial;
UPDATE tbl SET postal_code = (1230 + whatever)::text;
Have a look:
SELECT * FROM tbl;
Simply JOIN
result to the parent in the base table:
SELECT ct.*, t.postal_code
FROM (
SELECT ltree2text(subpath(path, -1, 1))::int AS tbl_id, count(*) AS children
FROM tbl
WHERE path <> ''
GROUP BY 1
) ct
LEFT JOIN (
SELECT tbl_id
FROM tbl
WHERE path = ''
) x USING (tbl_id)
JOIN tbl t USING (tbl_id)
WHERE x.tbl_id IS NULL
ORDER BY children DESC
LIMIT 1;