At the beginning I apologize for not being word-perfect in English.
I have two tables in my database, one contains questions, and second contains user answers for questions (for statistics).
TableA - questions
___________
| ID | Name |
TableB - Statistics
___________________________________
| ID | A_ID | U_ID| IsCorrect | Date|
User can answer one question several times, for example if we have question with ID = 1 and user (with id 2) which answered this question 4 times, we will add 4 rows to TableB:
___________________________________
| ID | A_ID | U_ID| IsCorrect | Date|
-------------------------------------
| 1 | 1 | 2 | True | Date|
| 2 | 1 | 2 | False | Date|
| 3 | 1 | 2 | False | Date|
| 4 | 1 | 2 | True | Date|
At the end, I have to query for questions (TableA) which user has not responded or answered but the smallest number of times (user is able to answer all questions).
My query (procedure) looks like:
Declare @max int
SET @max = (SELECT TOP 1 Count(A_ID) as QuestionCount FROM [TableB]
Where User_id = 1
GROUP BY A_ID
ORDER BY QuestionCount DESC)
SELECT TOP 40 ID
FROM [dbo].[TableA]
WHERE ID NOT IN (SELECT A_ID
FROM [dbo].[TableB]
WHERE User_id = 1
GROUP BY A_ID
HAVING Count(A_ID) = @max)
ORDER BY NewID()
At the beggining I query for max occurence of question - If user answered some question 4 time @max will be 4.
In second query I query for question which weren't answered yet (in this occurence).
Question is: How to optimize this query (or maybe I should change my tables)? TableB for now has almost one million rows and beacause of that it isn't fast enough.
With SQL-Server (>=2008) you can use the OVER clause (https://msdn.microsoft.com/en-us/library/ms189461.aspx) which gives you grouped aggregats.
EDIT: Just found your ORDER BY NewID()
Why do you do this? NewID() is very bad to sort... 1 million is not so much in fact, but 1 million GUIDs without an index are a mass...