Search code examples
sql-servert-sqlcommon-table-expression

SQL removing duplicate rows while returning their count


I have a table with two columns, I want to count the distinct values on Col_A over and also return distinct values of Col_A(while leaving the rest of the columns in tact there in response)

MyTable

Col_A | Col_B 
------+-------
A     | 1 
A     | 1 
A     | 2 
A     | 3 
b     | 4 
b     | 4 
b     | 5 

Expected result

Col_A   | Col_B | count
--------+-------+-------
A       | 1     | 4 
b       | 4     | 3 

I tried with this code

SELECT
    *,  
    ROW_NUMBER() OVER (PARTITION BY Col_A ORDER BY Col_A)  
FROM
    MyTable

It is not working (Col_B values are irreverent).

I have taken care of the item to be unique (just the first entry is selected from group of items with same Col_A). However I also have to count the number of rows where the Col_A is the same.

[EDIT]

the complete code in the CTE is as follows:-

WITH PagedReports AS (      
        SELECT      
        *, ROW_NUMBER() OVER (ORDER BY rm.ReportID) AS RowNumPaging  ,    
            ROW_NUMBER() OVER (partition by rm.ReportID ORDER BY rm.ReportName) AS RowNum     
        FROM      
            [dbo].[UserInfo] ui      
        INNER JOIN      
            [dbo].[UserReportMap] urm ON ui.SecureDbID = CONVERT(VARCHAR(50), urm.UserID)      
        INNER JOIN      
            [dbo].[ReportMaster] rm ON urm.ReportID = rm.ID      
        WHERE      
            ui.SecureDbID = 18          
    )      
    SELECT      
        ReportID,      
        ReportName,      
        Status,      
        ReportDate,      
        approvedAmount,      
        claimedAmount,      
        currency,      
        NoOfBills,      
        TotalAmount,  
        BillApprovalStatus          
    FROM (      
        SELECT      
            ReportID,      
            ReportName,      
            Status,      
            ReportDate,      
            approvedAmount,      
            claimedAmount,      
            currency,      
            COUNT(*) OVER (PARTITION BY ReportID) AS NoOfBills,      
            SUM(claimedAmount) OVER (PARTITION BY ReportID) AS TotalAmount,  
            BillApprovalStatus,  
            RowNum,    
            RowNumPaging    
        FROM PagedReports      
        WHERE RowNumPaging BETWEEN 1 AND 100            
        )       
        
        AS PagedResults     
        where NoOfBills=1; 

Solution

  • I'm assuming that a general aggregate like:

    select A, MIN(B), COUNT(*)
    FROM yourtable
    GROUP BY A
    

    doesn't work here, for example if you have more than 2 columns.

    The way to do it is then is a window aggregate + row_number, something like:

    SELECT  *
    FROM    (
        SELECT  *
        ,   ROW_NUMBER() OVER(PARTITION BY Col_A ORDER BY Col_B) AS sort
        ,   COUNT(*) OVER(PARTITION BY Col_A) AS cnt
        FROM    (
            VALUES  (N'A', 1)
            ,   (N'A', 1)
            ,   (N'A', 2)
            ,   (N'A', 3)
            ,   (N'b', 4)
            ,   (N'b', 4)
            ,   (N'b', 5)
        ) t (Col_A,Col_B)
        ) x
    WHERE   x.sort = 1