Search code examples
sqlsql-serversql-server-ce

How to find non-matching nth child rows by name?


I am trying to write a query to find the nth child records only where the nth child records do not match in name.

Simplified Data Structure:

  • Root (PK: id)
    • Job (PK: id, FK: idRoot -> Root.id)
      • Task (PK: id, FK: idParent -> Job.id, FK: idRoot -> Root.id)

All ID fields are strings (I'm at the mercy of the current design, it's not ours to change)

The idea is that a sub-query will get only one match per root, and those will be evaluated against the other single match from the other roots. Unless all of the results from the subquery match (on a specified row), they all fail. Basically I need all nth tasks to have the same name across all of the selected roots, otherwise they all fail.

I have a query that gives me each nth child record per root (where Task is the child table). However I can't figure out how to check for equality. Additionally this is required to work for both SQL Compact and SQL Server, which limits some functions.

This is for n = 1 (first task)

SELECT COUNT(r.id) as RowIndex, t.id, t.idRoot, t.Name
FROM Task t
  INNER JOIN Job j ON j.id = t.idParent
  INNER JOIN (
    SELECT b.id, b.idParent, b.idRoot, b.StartDate
    FROM Task b
      INNER JOIN Job c ON b.idParent = c.idRec
  ) r ON t.StartDate >= r.StartDate
      AND r.idParent = j.id 
      AND t.idRoot = r.idRoot
WHERE t.idRoot IN ('1', '2', '3', '4')
  AND j.Type LIKE '%Example%'
GROUP BY t.idRoot, t.id, t.Name
HAVING COUNT(r.id) = 1    

That gives me back something like:

╔══════════╦════╦════════╦════════════╗
║ RowIndex ║ ID ║ IDRoot ║    Name    ║
╠══════════╬════╬════════╬════════════╣
║        1 ║  4 ║      2 ║ 1st Task   ║
║        1 ║  7 ║      3 ║ 1st Task   ║
║        1 ║ 11 ║      4 ║ First Task ║
╚══════════╩════╩════════╩════════════╝

So in the above results I'm correctly getting back the first task from 'Example' job types. I can't figure out how to get results only when the Names didn't all match. If each of the names was '1st Task', the query should return back nothing.

UPDATE:

Here's a link to pastebin that can create the tables and load sample data.

Query A returns the first task for job type A, and the results show the task names don't match. Therefore I'd want both of these task IDs to be returned.

Query B returns the first tasks for job type B, and the results show all task names match. Therefore I'd want none of these task IDs returned.


Solution

  • Try using the following query:

    
    SELECT 
    r.ID 
    , j.TYPE 
    , t.Name 
    , t.StartDate 
    INTO #CoreData 
    FROM root r 
    INNER JOIN job j 
    ON j.idRoot = r.id 
    INNER JOIN task t 
    ON t.idParent = j.id 
    AND t.idRoot = r.id 
    
    SELECT 
    c.ID 
    , c.Name 
    , c.StartDate 
    , c.TYPE 
    , x.RowID 
    INTO #CoreDataWithRowID 
    FROM #CoreData c 
    CROSS APPLY ( 
    SELECT 
    COUNT(*) AS RowID 
    FROM #CoreData cd 
    WHERE cd.TYPE = c.TYPE 
    AND cd.ID = c.ID 
    AND cd.StartDate <= c.StartDate 
    ) x 
    
    
    SELECT 
    oa.* 
    FROM #CoreDataWithRowID oa 
    INNER JOIN #CoreDataWithRowID ob 
    ON oa.TYPE = ob.TYPE 
    AND oa.RowID = ob.RowID 
    AND oa.ID <= ob.ID 
    WHERE oa.Name != ob.Name 
    UNION 
    SELECT 
    oa.* 
    FROM #CoreDataWithRowID oa 
    INNER JOIN #CoreDataWithRowID ob 
    ON oa.TYPE = ob.TYPE 
    AND oa.RowID = ob.RowID 
    AND oa.ID >= ob.ID 
    WHERE oa.Name != ob.Name
    

    This will hopefully work around some of the limitations caused by SQL Server Compact Edition.