I have a query as:
WITH Top_Areas AS
(
SELECT AIP.aid [Author_ID], CAST(P.research_area as VARCHAR(100)) [Research_Area],
P.research_area_category_id [Category],
COUNT(*) [Paper_Count],
P.p_year [Paper_Year]
FROM coauthor_individual_papers AIP
JOIN sub_aminer_paper P ON AIP.pid = P.pid
WHERE AIP.aid IN (901706) AND AIP.p_year IN (2005)
GROUP BY
AIP.aid, CAST(P.research_area as VARCHAR(100)),
P.research_area_category_id,
P.p_year
),
Top_Author_Areas AS
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Author_ID, Paper_Year ORDER BY Paper_Count DESC, Research_Area ASC) AS Rank FROM Top_Areas)
SELECT *
FROM CTE1
WHERE Rank <= 5
which gives output as:
Author_ID Research_Area Category Paper_Count Paper_Year Rank
--------------------------------------------------------------------
901706 feature 8 1 2005 1
901706 image annotate 11 1 2005 2
901706 image db 11 1 2005 3
901706 retrieval model 12 1 2005 4
This output is just ORDERING BY Research_Area
i.e. ASC
whereas I also need ORDERING BY Paper_Count
i.e. DESC
, whereas the desired output should be like:
Author_ID Category Paper_Count Paper_Year Rank
----------------------------------------------------
901706 11 2 2005 1
901706 8 1 2005 2
901706 12 1 2005 3
It is possible when we not to select Research_Area
in SELECT
list, but if we not select Research_Area
, then how to ORDER BY Research_Area ASC
, whereas we want to use ORDER BY Research_Area ASC
option when Paper_Count
value is same as shown in desired output above i.e.
Category
11 is at Rank
1 because of highest value of Paper_Count
,
Category
8 and 11 are at Rank
2 and 3 respectively because of following reasons:
Paper_Count
value is same.Rank
2 and 3 as by Research_Area ASC
How can we get desired output?
Since you still don't seem to get my point about there being possibly different Research_Area
values when we're trying to implement your tie-breaking rule, I'll decide arbitrarily - we'll use the earliest Research_Area
value within each Category
to do the tie-breaking:
WITH Top_Areas AS
(
SELECT AIP.aid [Author_ID],
MIN(CAST(P.research_area as VARCHAR(100))) [Research_Area],
P.research_area_category_id [Category],
COUNT(*) [Paper_Count],
P.p_year [Paper_Year]
FROM coauthor_individual_papers AIP
JOIN sub_aminer_paper P ON AIP.pid = P.pid
WHERE AIP.aid IN (901706) AND AIP.p_year IN (2005)
GROUP BY
AIP.aid,
--CAST(P.research_area as VARCHAR(100)),
P.research_area_category_id,
P.p_year
),
Top_Author_Areas AS
(SELECT *, ROW_NUMBER() OVER(PARTITION BY Author_ID, Paper_Year
ORDER BY Paper_Count DESC, Research_Area ASC) AS Rank
FROM Top_Areas)
SELECT *
FROM CTE1
WHERE Rank <= 5
Changed areas are on lines separated from rest of the query by blank lines - I'm just applying MIN
to Research_Area
in order to obtain a single value for use in the tie-breaking, and have removed it from the GROUP BY
, both changes within the Top_Areas
CTE.