I have a table that stores hierarchical data, where each record may have a parent_item_code referencing another record in the same table. I'm using a Common Table Expression (CTE) to build the full path of each item, concatenating parent names until reaching the final items (leaves).
Here is the query I’m using:
WITH Hierarchy AS (
-- Select root records (those that do not have a parent)
SELECT
i.item_code,
i.name,
i.parent_item_code,
CAST(i.name AS VARCHAR(1000)) AS hierarchy
FROM sgt_consulta.items i
WHERE i.item_type = 'C'
AND i.status = 'A'
AND i.parent_item_code IS NULL -- Only root records
UNION ALL
-- For each node found, retrieve its children and concatenate the hierarchy
SELECT
c.item_code,
c.name,
c.parent_item_code,
CAST(h.hierarchy + ' > ' + c.name AS VARCHAR(1000)) AS hierarchy
FROM sgt_consulta.items c
INNER JOIN Hierarchy h ON c.parent_item_code = h.item_code
WHERE c.item_type = 'C'
AND c.status = 'A'
)
-- Select only leaf records (those that have no children)
SELECT
h.item_code,
h.name,
h.hierarchy
FROM Hierarchy h
LEFT JOIN sgt_consulta.items i ON h.item_code = i.parent_item_code
WHERE i.parent_item_code IS NULL; -- Only final records
The query works correctly for most records, but I noticed that some records are missing from the result even though:
They have a complete hierarchy, meaning their ancestors are correctly structured. They do not have NULL values in any important fields (item_code, parent_item_code, name, etc.). They should be identified as leaf records, but they are being excluded. I suspect that the issue may be related to the final LEFT JOIN condition:
LEFT JOIN sgt_consulta.items i ON h.item_code = i.parent_item_code
WHERE i.parent_item_code IS NULL; -- Only final records
If I remove this condition, the missing records appear again, but then I get non-final items in the results, which I want to avoid.
What could be causing this behavior? Is there a more reliable way to filter only leaf nodes without excluding valid records?
Here are some examples of working records and the one I'm having trouble specifically (could only find this one so far. Don't mind the names in portuguese, they shouldn't affect the recreation):
Records being returned correctly (first row is the leaf node):
item_code | name | parent_item_code | item_type | status |
---|---|---|---|---|
261 | Carta Precatória Cível | 257 | C | A |
257 | Cartas | 214 | C | A |
214 | Outros procedimentos | 2 | C | A |
2 | PROCESSO CÍVEL E DO TRABALHO | NULL | C | A |
item_code | name | parent_item_code | item_type | status |
---|---|---|---|---|
1729 | Agravo Interno Criminal | 412 | C | A |
412 | Recursos | 268 | C | A |
268 | PROCESSO CRIMINAL | NULL | C | A |
Troublesome record (first row is the leaf node being excluded):
item_code | name | parent_item_code | item_type | status |
---|---|---|---|---|
7 | Procedimento Comum Cível | 1107 | C | A |
1107 | Procedimento de conhecimento | 1106 | C | A |
1106 | Processo de conhecimento | 2 | C | A |
2 | PROCESSO CÍVEL E DO TRABALHO | NULL | C | A |
Adding the answer as requested. On the previous to last line, that ON
predicate is wrong. You need to include the same filters you are using when walking the tree. That is:
ON h.item_code = i.parent_item_code AND i.item_type = 'C' AND i.status = 'A'
The whole query will look like:
WITH Hierarchy AS (
-- Select root records (those that do not have a parent)
SELECT
i.item_code,
i.name,
i.parent_item_code,
CAST(i.name AS VARCHAR(1000)) AS hierarchy
FROM sgt_consulta.items i
WHERE i.item_type = 'C'
AND i.status = 'A'
AND i.parent_item_code IS NULL -- Only root records
UNION ALL
-- For each node found, retrieve its children and concatenate the hierarchy
SELECT
c.item_code,
c.name,
c.parent_item_code,
CAST(h.hierarchy + ' > ' + c.name AS VARCHAR(1000)) AS hierarchy
FROM sgt_consulta.items c
INNER JOIN Hierarchy h ON c.parent_item_code = h.item_code
WHERE c.item_type = 'C'
AND c.status = 'A'
)
-- Select only leaf records (those that have no children)
SELECT
h.item_code,
h.name,
h.hierarchy
FROM Hierarchy h
LEFT JOIN sgt_consulta.items i ON h.item_code = i.parent_item_code
AND i.item_type = 'C' AND i.status = 'A'
WHERE i.parent_item_code IS NULL; -- Only final records