Search code examples
sqlsqliterecursive-query

recursive SQLite SELECT: can’t wrap my head around it


I have this SQLite table:

CREATE TABLE seed (
  code TEXT PRIMARY KEY,
  name TEXT,
  grams_per_seed REAL,
  mother TEXT REFERENCES seed (code) DEFERRABLE INITIALLY DEFERRED,
  notes TEXT,
  mother_notes TEXT
) STRICT;

containing data including this:

sqlite> select code, mother from seed
   ...> where code in ("RP20-3", "RP19-1", "BC18-MG", "BC18-TI");
code     mother
-------  -------
BC18-MG
BC18-TI
RP19-1   BC18-TI
RP20-3   RP19-1

Note the ancestry chain that goes RP20-3RP19-1BC18-TI. There are many of these in the real data.

What I want is a query that maps the seed codes to the length of their ancestry chains. For example:

sqlite> select [... something ...] from [... something else ...];
code     a_ct
-------  ----
BC18-MG     0
BC18-TI     0
RP19-1      1
RP20-3      2

I’ve gotten partway there with this kludge, but I know it’s not right because the maximum chain length is encoded in the join width:

sqlite> select child.code, parent.code, grand.code
   ...> from seed as child, seed as parent, seed as grand
   ...> where     child.code = "RP20-3"
   ...>       and child.mother = parent.code
   ...>       and parent.mother = grand.code;
code    code    code
------  ------  -------
RP20-3  RP19-1  BC18-TI

I’m pretty sure this requires a recursive query, but I haven’t been able to get it to work. I’ve tried several queries designed to return one row per member of the chain that I can count, but I get either no rows or an infinite number of them.

This question is very similar to what I want, but it returns ancestors only for a single item, rather than all items.

I’ve found a number of examples of recursive queries, including parent-child examples. It seems like this should be relatively straightforward, but I’m getting stuck somewhere. I do have a good handle on recursion in procedural languages. What am I missing?


Solution

  • It may be useful, examples with recursive queries

    with recursive r as (
     select code chainhead,1 chainnum,code,mother from seed s1 
     where not exists(select * from seed s2 where s2.code=s1.mother)
     union all 
     select chainhead ,chainnum+1 n,s.code,s.mother
     from r inner join seed s on s.mother=r.code
     )
    select chainhead
      ,min(case when chainnum=1 then code else null end) link0 
      ,min(case when chainnum=2 then code else null end) link2 
      ,min(case when chainnum=3 then code else null end) link3 
      ,min(case when chainnum=4 then code else null end) link4 
    from r
    group by chainhead;
    
    chainhead link0 link2 link3 link4
    BC18-MG BC18-MG (null) (null) (null)
    BC18-TI BC18-TI RP19-1 RP20-3 (null)

    You may add as many as needed columns "linkN". In all cases column count is fixed.

    In next example, accumulate chain in text. Limitation is recursion depth, or max TEXT size, not fixed.

    with recursive rs as (
     select code chainhead,1 linknum,code,mother, code path from seed s1 
     where not exists(select * from seed s2 where s2.code=s1.mother)
     union all 
     select chainhead ,linknum+1 n,s.code,s.mother,r.path||','||s.code
     from rs r inner join seed s on s.mother=r.code
     )
    select * from rs;
    
    chainhead linknum code mother path
    BC18-MG 1 BC18-MG BC18-MG
    BC18-TI 1 BC18-TI BC18-TI
    BC18-TI 2 RP19-1 BC18-TI BC18-TI,RP19-1
    BC18-TI 3 RP20-3 RP19-1 BC18-TI,RP19-1,RP20-3