Search code examples
sql-servert-sql

T-SQL calculate percentage of count group by date


I have a table that has the following structure and trying to write a T-SQL to calculate percentage of the counts. The percentage for each row should be the percent of the total for one of the grouping which in my case is the date.

CREATE TABLE ProductSales
(
    Id int IDENTITY(1,1) NOT NULL,
    AccountId int NOT NULL,
    SaleDate date NOT NULL,
    ProductType varchar(50) NOT NULL,
    ProductSubType varchar(50) NOT NULL,
    Amount money NOT NULL,

    CONSTRAINT PK_ProductSales 
        PRIMARY KEY CLUSTERED (Id ASC)
                WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
                      IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
                      ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON PRIMARY
) ON PRIMARY
GO

INSERT INTO ProductSales (AccountId, SaleDate, ProductType, ProductSubType, Amount) 
VALUES  (1, CAST(N'2023-11-01' AS Date), N'Books', N'Comics', 12.0000),
        (2, CAST(N'2023-11-01' AS Date), N'Books', N'Comics', 22.0000),
        (3, CAST(N'2023-11-01' AS Date), N'Clothing', N'Polos', 2.0000),
        (4, CAST(N'2023-11-01' AS Date), N'Books', N'Fonics', 18.0000),
        (5, CAST(N'2023-11-02' AS Date), N'Clothing', N'Jackets', 22.0000),
        (5, CAST(N'2023-11-02' AS Date), N'Clothing', N'Polos', 18.0000),
        (2, CAST(N'2023-11-02' AS Date), N'Clothing', N'Jeans', 19.0000),
        (1, CAST(N'2023-11-02' AS Date), N'Clothing', N'Jackets', 187.0000)

I would like the last column as a percentage of total rows for each day as below. The calculation of percentage was done manually and not using T-SQL.

SaleDate    ProductType ProductSubType  Total   Percentage
----------------------------------------------------------
11/1/2023   Books   Comics  2   50.00%
11/1/2023   Books   Fonics  1   25.00%
11/1/2023   Clothing    Polos   1   25.00%

11/2/2023   Books   Comics  2   33.33%
11/2/2023   Clothing    Jackets 2   33.33%
11/2/2023   Clothing    Jeans   1   16.67%
11/2/2023   Clothing    Polos   1   16.67%

Here is my query that I tried. But the percentage is the total of all the rows. I need that by date.

DECLARE @startdt date = '2023-11-01', 
        @enddt date = '2023-12-13'

SELECT
    CAST(SaleDate as date) SaleDate, ProductType, ProductSubType, 
    COUNT(1) Total,
    (100.0 * COUNT(1) / (SELECT COUNT(1) FROM ProductSales 
                         WHERE SaleDate BETWEEN @startdt AND @enddt)) AS 'Percentage'
FROM
    ProductSales 
WHERE
    SaleDate BETWEEN @startdt AND @enddt
GROUP BY 
    CAST(SaleDate AS date), ProductType, ProductSubType
ORDER BY
    SaleDate, ProductType

Solution

  • You need to correlate the subquery table with the main one, by doing WHERE innerTable.date = outerTable.date.

    SELECT
        SaleDate, ProductType, ProductSubType, 
        COUNT(*) Total,
        (100.0 * COUNT(*) / (SELECT COUNT(*) FROM ProductSales ps2
                             WHERE ps2.SaleDate = ps.SaleDate
                           --  group by ps.SaleDate
                             )) AS 'Percentage'
    FROM
        ProductSales ps
    WHERE
        SaleDate BETWEEN @startdt AND @enddt
    GROUP BY 
        SaleDate, ProductType, ProductSubType
    ORDER BY
        SaleDate, ProductType
    

    Watch out for group by columns there.

    But you can also bin the above and just use the windowed aggregation to avoid the subquery altogether:

    SELECT
        SaleDate, ProductType, ProductSubType, 
        COUNT(*) Total,
        100.0 * COUNT(*) / SUM(COUNT(*)) OVER(PARTITION BY SaleDate) AS 'Percentage'
    FROM
        ProductSales ps
    WHERE
        SaleDate BETWEEN @startdt AND @enddt
    GROUP BY 
        SaleDate, ProductType, ProductSubType
    ORDER BY
        SaleDate, ProductType
    

    SUM(COUNT(*)) OVER(PARTITION BY SaleDate) summarizes the counts and generates the total sum which you can use to get the percentage