Search code examples
sqlsql-serversql-server-2008row-number

Filtering on ROW_NUMBER() is changing the results


I did implement an OData service of my own that takes an SQL statement and apply the top / skip filter using a ROW_NUMBER(). Most statement tested so far are working well except for a statement involving 2 levels of Left Join. For some reason I can't explain, the data returned by the sql is changing when I apply a where clause on the row number column.

For readability (and testing), I removed most of the sql to keep only the faulty part. Basically, you have a Patients table that may have 0 to N Diagnostics and the Diagnostics may have 0 to N Treatments:

SELECT RowNumber, PatientID, DiagnosticID, TreatmentID 
FROM 
(
    SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RowNumber
         , *
    FROM PATIENTS
    LEFT JOIN DIAGNOSTICS ON DIAGNOSTICS.PatientID = PATIENTS.PatientID
    LEFT JOIN TREATMENTS ON TREATMENTS.DiagnosticID = DIAGNOSTICS.DiagnosticID
) AS Wrapper
--WHERE RowNumber BETWEEN 1 AND 10
--If I uncomment the line above, I'll get 10 lines that differs from the first 10 line of this query

This is the results I got from the statement above. The result on the left is showing the first 10 rows without the WHERE clause while the one on the right is showing the results with the WHERE clause.

Results

For the record, I'm using SQL Server 2008 R2 SP3. My application is in C# but the problem occurs in SQL server too so I don't think .NET is involved in this case.

EDIT

About the ORDER BY (SELECT NULL), I took that code a while ago from this SO question. However, an order by null will work only if the statement is sorted... in my case, I forgot about adding an order by clause so that's why I was getting some random sorting.


Solution

  • Let me first ask: why do you expect it to be the same? Or rather, why do you expect it to be anything in particular? You haven't imposed an ordering, so the query optimizer is free to use whatever execution operators are most efficient (according to its cost scheme). When you add the WHERE clause, the plan will change and the natural ordering of the results will be different. This can also happen when adding joins or subqueries, for example.

    If you want the results to come back in a specific order, you need to actually use the ORDER BY subclause of the ROW_NUMBER() window function. I'm not sure why you are ordering by SELECT NULL, but I can guarantee you that's the problem.