Search code examples
postgresqlrecursionrecursive-query

Recursive query without duplicating the non-recursive term


I have hierarchy data where each record belongs to a parent group, and need to retreive all records plus their parent record, but do not wish to duplicate a record if it already was included in the initial non-recursive term.

For example, the following query returns the first six records where provided by the non-recursive term, and then the recursive term returns the next five records for the parents, however three of the parent records (child_id 4528, 4539 and 4541) already were returned by the initial non-recursive term and should not be duplicated.

I tried using DISTINCT, adding cs.id!=t.child_id (no effect?) to the WHERE clause, using t in a subquery (evidently not possible) with no success.

How can this query be modified to not return duplicated rows?

facdocs=> WITH RECURSIVE t(id, child_id, parent_id) AS (
    SELECT s.id, cs.id child_id, cs.parent_id, cs.name, cs.spec
    FROM project p
    INNER JOIN specification s ON s.project_id=p.id
    INNER JOIN csi_spec cs ON cs.id=s.csi_spec_id
    WHERE p.id = 1
    UNION
    SELECT null id, cs.id child_id, cs.parent_id, cs.name, cs.spec
    FROM t, csi_spec cs
WHERE cs.id=t.parent_id AND cs.id!=t.child_id AND cs.id
)
SELECT * FROM t;

 id | child_id | parent_id |                        name                        |    spec
----+----------+-----------+----------------------------------------------------+-------------
  7 |     4543 |      4541 | Medium-Voltage, Single- and Multi-Conductor Cables | 26 05 13.16
  8 |     4528 |         2 | Electrical                                         | 26 00 00
  9 |     4539 |      4528 | Common Work Results for Electrical                 | 26 05 00
 11 |     4541 |      4539 | Medium-Voltage Cables                              | 26 05 13
 12 |     4542 |      4541 | Medium-Voltage Open Conductors                     | 26 05 13.13
 13 |     4578 |      4573 | Wiring Device Schedule                             | 26 06 20.26
    |     4528 |         2 | Electrical                                         | 26 00 00
    |     4539 |      4528 | Common Work Results for Electrical                 | 26 05 00
    |     4541 |      4539 | Medium-Voltage Cables                              | 26 05 13
    |     4573 |      4571 | Schedules for Low-Voltage Electrical Distribution  | 26 06 20
    |     4571 |      4528 | Schedules for Electrical                           | 26 06 00
(11 rows)

For refence only, below shows the tables:

facdocs=> \d csi_spec
                       Table "public.csi_spec"
  Column   |          Type          | Collation | Nullable | Default
-----------+------------------------+-----------+----------+---------
 id        | integer                |           | not null |
 parent_id | integer                |           | not null |
 name      | character varying(255) |           | not null |
 spec      | character varying(255) |           | not null |
 div       | character varying(2)   |           | not null |
 section   | character varying(2)   |           | not null |
 scope     | character varying(2)   |           | not null |
 subscope  | character varying(2)   |           | not null |
Indexes:
    "csi_spec_pkey" PRIMARY KEY, btree (id)
    "idx_unique_spec" UNIQUE, btree (div, section, scope, subscope)
    "uniq_92347c2dc00e173e" UNIQUE, btree (spec)
    "idx_92347c2d727aca70" btree (parent_id)
    "idx_div" btree (div)
    "idx_scope" btree (subscope)
    "idx_section" btree (section)
Foreign-key constraints:
    "fk_92347c2d727aca70" FOREIGN KEY (parent_id) REFERENCES csi_spec(id)
Referenced by:
    TABLE "csi_spec" CONSTRAINT "fk_92347c2d727aca70" FOREIGN KEY (parent_id) REFERENCES csi_spec(id)
    TABLE "specification" CONSTRAINT "fk_e3f1a9af4ead25e" FOREIGN KEY (csi_spec_id) REFERENCES csi_spec(id)

facdocs=> \d specification
                         Table "public.specification"
   Column    |              Type              | Collation | Nullable | Default
-------------+--------------------------------+-----------+----------+---------
 id          | integer                        |           | not null |
 project_id  | integer                        |           | not null |
 owner_id    | integer                        |           | not null |
 csi_spec_id | integer                        |           | not null |
 create_at   | timestamp(0) without time zone |           | not null |
 notes       | text                           |           |          |
Indexes:
    "specification_pkey" PRIMARY KEY, btree (id)
    "idx_e3f1a9a166d1f9c" btree (project_id)
    "idx_e3f1a9a7e3c61f9" btree (owner_id)
    "idx_e3f1a9af4ead25e" btree (csi_spec_id)
Foreign-key constraints:
    "fk_e3f1a9a166d1f9c" FOREIGN KEY (project_id) REFERENCES project(id)
    "fk_e3f1a9a7e3c61f9" FOREIGN KEY (owner_id) REFERENCES "user"(id)
    "fk_e3f1a9af4ead25e" FOREIGN KEY (csi_spec_id) REFERENCES csi_spec(id)
Referenced by:
    TABLE "document" CONSTRAINT "fk_d8698a76908e2ffe" FOREIGN KEY (specification_id) REFERENCES specification(id)

facdocs=> \d project
                                         Table "public.project"
   Column    |              Type              | Collation | Nullable |              Default
-------------+--------------------------------+-----------+----------+-----------------------------------
 id          | integer                        |           | not null |
 account_id  | integer                        |           | not null |
 name        | character varying(255)         |           | not null |
 project_id  | character varying(255)         |           | not null |
 create_at   | timestamp(0) without time zone |           | not null |
 is_active   | boolean                        |           | not null |
 start_at    | timestamp(0) without time zone |           |          | NULL::timestamp without time zone
 description | text                           |           |          |
Indexes:
    "project_pkey" PRIMARY KEY, btree (id)
    "name_unique" UNIQUE, btree (name, account_id)
    "project_id_unique" UNIQUE, btree (project_id, account_id)
    "idx_2fb3d0ee9b6b5fba" btree (account_id)
Foreign-key constraints:
    "fk_2fb3d0ee9b6b5fba" FOREIGN KEY (account_id) REFERENCES account(id)
Referenced by:
    TABLE "project_vendor" CONSTRAINT "fk_e286d8bc166d1f9c" FOREIGN KEY (project_id) REFERENCES project(id) ON DELETE CASCADE
    TABLE "specification" CONSTRAINT "fk_e3f1a9a166d1f9c" FOREIGN KEY (project_id) REFERENCES project(id)

facdocs=>

Solution

  • A simple fix is to exclude the duplicates in the main query:

    WITH RECURSIVE t (...)
    SELECT DISTINCT ON (child_id) *
    FROM t
    ORDER BY child_id, id;
    

    This should work, because DISTINCT ON will return the first row per client_id and NULL sorts at the end by default.