Search code examples
sqlitesequencecommon-table-expressionmissing-data

Finding missing records and missing sequences in SQLite table - where the first entry in the sequence must be 1


Refer to Identifying missing sequences in SQLite table for the original problem and solution.

The following code correctly identifies missing sequences, however, its starting point is the lowest number in the sequence based on the data in the table. So for example, if the sequence is supposed to start at 1, but the first sequence in the data begins with 5, then 1..4 are not included in the result.

Any tips on how to test using a predefined start sequence?

SELECT dir, start, stop 
FROM (
  SELECT m.id + 1 start,
    (SELECT MIN(id) - 1 FROM sequence x WHERE x.__dirpath = m.__dirpath AND x.id > m.id) stop, 
     m.__dirpath dir
  FROM sequence m LEFT JOIN sequence r 
  ON m.__dirpath = r.__dirpath AND m.id = r.id - 1
  WHERE r.id IS NULL
) 
WHERE stop IS NOT NULL 
ORDER BY dir, start, stop;

Solution

  • All you have to do is include in the dataset 1 row for each __dirpath with id equal to 0, provided that such a row does not already exist in the table, since the min possible id is 1.
    You can do that with UNION ALL inside a CTE.
    Then use that CTE in your query:

    WITH cte AS (
      SELECT __dirpath, id FROM sequence
      UNION ALL
      SELECT DISTINCT __dirpath, '0' FROM sequence 
    )
    SELECT dir, start, stop 
    FROM (
      SELECT m.id + 1 start,
        (SELECT MIN(id) - 1 FROM cte x WHERE x.__dirpath = m.__dirpath AND x.id > m.id) stop, 
         m.__dirpath dir
      FROM cte m LEFT JOIN cte r 
      ON m.__dirpath = r.__dirpath AND m.id = r.id - 1
      WHERE r.id IS NULL
    ) 
    WHERE stop IS NOT NULL 
    ORDER BY dir, start, stop;
    

    See the demo.