I have a single SQLite table that has the following fields representing metadata pulled from individual files stored on disk. There's a record for every file:
__path denotes the full path and filename (in effect the PK)
__dirpath denotes the directory path excluding the filename
__dirname denotes the directory name in which the file is found
refid denotes an attribute of interest, pulled from the underlying file on disk
My query for identifying offending folders is as follows:
SELECT __dirpath
FROM (
SELECT DISTINCT __dirpath,
__dirname,
refid
FROM source
)
GROUP BY __dirpath
HAVING count( * ) > 1
ORDER BY __dirpath, __dirname;
Is it possible to iterate through a query's results and use each result as the input into another query without resorting to using something like Python alongside SQLite? As an example, to see records belonging to failed sets:
SELECT __dirpath, refid
FROM source
WHERE __dirpath = <nth result from aforementioned query>;
If you want all offending rows, one option is:
select t.*
from (
select t.*,
min(refid) over(partition by __dirpath, __dirname) as min_refid,
max(refid) over(partition by __dirpath, __dirname) as max_refid
from mytable t
) t
where min_refid <> max_refid
The logic is to compare the minimum and maximum refid
over each group of rows having the same dir path and dir name. If they differ, then the row is offending.
We could also use exists
- which would better handle possible null
values in refid
:
select t.*
from mytable t
where exists (
select 1
from mytable t1
where
t1.__dirpath = t.__dirpath
and t1.__dirname = t.__dirname
and t1.ref_id is not t.ref_id
)