I have a table
and a lot of values like
INSERT [dbo].[Exam] ([studentID], [subjectCode], [examDate], [examScore]) VALUES (N'ST01', N'PRJ', CAST(0x253C0B00 AS Date), 5)
INSERT [dbo].[Exam] ([studentID], [subjectCode], [examDate], [examScore]) VALUES (N'ST01', N'WIG', CAST(0x253C0B00 AS Date), 1)
. . .
INSERT [dbo].[Exam] ([studentID], [subjectCode], [examDate], [examScore]) VALUES (N'ST09', N'WIG', CAST(0x253C0B00 AS Date), 4)
INSERT [dbo].[Exam] ([studentID], [subjectCode], [examDate], [examScore]) VALUES (N'ST09', N'WIG', CAST(0x263C0B00 AS Date), 9)
INSERT [dbo].[Exam] ([studentID], [subjectCode], [examDate], [examScore]) VALUES (N'ST10', N'MAD', CAST(0x253C0B00 AS Date), 3)
INSERT [dbo].[Exam] ([studentID], [subjectCode], [examDate], [examScore]) VALUES (N'ST10', N'MAE', CAST(0x253C0B00 AS Date), 4)
I have to calculate all average score of students such that: (Sorry for the horrible format)
s_i is the average score of the student i^th with i=1…M where M is the total number of students N is the total number of subjects. The average score of the student s_i can be calculated by:
Where x_(i,k) is the last exam’s score of the student s_i on the subjectk^th. The last exam’s score means: student may retake the exam, and only the final result will be involved into the average score calculation. The x_(i,k) may receive Null value if student s_i did not take exam on subject k^th.
I can merely do something like
select studentID, sum(examScore)/count(subjectCode) from Exam group by studentID
How can I process forward the problem with these special cases in the second half?
As I understand your question, you want the average per student, taking into account only its latest score per subject. If so, you can use window functions for filtering, then agregation:
select studentid, avg(examscore) avgscore
from (
select e.*,
row_number() over(partition by studentid, subjectcode order by examdate desc) rn
from exam e
) e
where rn = 1
group by studentid
If you want to count "missing" subjects as 0
, then it's a bit more complicated. You would typically cross join
the list of distinct students and subjects, then bring the table with a left join
:
select st.studentid, avg(coalesce(e.examscore, 0)) avgscore
from (select distinct studentid from exam) st
cross join (select distinct subjectcode from exam) su
left join (
select e.*,
row_number() over(partition by studentid, subjectcode order by examdate desc) rn
from exam e
) e
on e.studentid = st.studentid
and e.subjectcode = su.subjectcode
and e.rn = 1
group by st.studentid
In a real life situation, you would probably have separate referential tables to store the students and subjects, which you would use instead of the select distinct
_ subqueries.