I have a table that has the following structure and trying to write a T-SQL to calculate percentage of the counts. The percentage for each row should be the percent of the total for one of the grouping which in my case is the date.
CREATE TABLE ProductSales
(
Id int IDENTITY(1,1) NOT NULL,
AccountId int NOT NULL,
SaleDate date NOT NULL,
ProductType varchar(50) NOT NULL,
ProductSubType varchar(50) NOT NULL,
Amount money NOT NULL,
CONSTRAINT PK_ProductSales
PRIMARY KEY CLUSTERED (Id ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON PRIMARY
) ON PRIMARY
GO
INSERT INTO ProductSales (AccountId, SaleDate, ProductType, ProductSubType, Amount)
VALUES (1, CAST(N'2023-11-01' AS Date), N'Books', N'Comics', 12.0000),
(2, CAST(N'2023-11-01' AS Date), N'Books', N'Comics', 22.0000),
(3, CAST(N'2023-11-01' AS Date), N'Clothing', N'Polos', 2.0000),
(4, CAST(N'2023-11-01' AS Date), N'Books', N'Fonics', 18.0000),
(5, CAST(N'2023-11-02' AS Date), N'Clothing', N'Jackets', 22.0000),
(5, CAST(N'2023-11-02' AS Date), N'Clothing', N'Polos', 18.0000),
(2, CAST(N'2023-11-02' AS Date), N'Clothing', N'Jeans', 19.0000),
(1, CAST(N'2023-11-02' AS Date), N'Clothing', N'Jackets', 187.0000)
I would like the last column as a percentage of total rows for each day as below. The calculation of percentage was done manually and not using T-SQL.
SaleDate ProductType ProductSubType Total Percentage
----------------------------------------------------------
11/1/2023 Books Comics 2 50.00%
11/1/2023 Books Fonics 1 25.00%
11/1/2023 Clothing Polos 1 25.00%
11/2/2023 Books Comics 2 33.33%
11/2/2023 Clothing Jackets 2 33.33%
11/2/2023 Clothing Jeans 1 16.67%
11/2/2023 Clothing Polos 1 16.67%
Here is my query that I tried. But the percentage is the total of all the rows. I need that by date.
DECLARE @startdt date = '2023-11-01',
@enddt date = '2023-12-13'
SELECT
CAST(SaleDate as date) SaleDate, ProductType, ProductSubType,
COUNT(1) Total,
(100.0 * COUNT(1) / (SELECT COUNT(1) FROM ProductSales
WHERE SaleDate BETWEEN @startdt AND @enddt)) AS 'Percentage'
FROM
ProductSales
WHERE
SaleDate BETWEEN @startdt AND @enddt
GROUP BY
CAST(SaleDate AS date), ProductType, ProductSubType
ORDER BY
SaleDate, ProductType
You need to correlate the subquery table with the main one, by doing WHERE innerTable.date = outerTable.date.
SELECT
SaleDate, ProductType, ProductSubType,
COUNT(*) Total,
(100.0 * COUNT(*) / (SELECT COUNT(*) FROM ProductSales ps2
WHERE ps2.SaleDate = ps.SaleDate
-- group by ps.SaleDate
)) AS 'Percentage'
FROM
ProductSales ps
WHERE
SaleDate BETWEEN @startdt AND @enddt
GROUP BY
SaleDate, ProductType, ProductSubType
ORDER BY
SaleDate, ProductType
Watch out for group by columns there.
But you can also bin the above and just use the windowed aggregation to avoid the subquery altogether:
SELECT
SaleDate, ProductType, ProductSubType,
COUNT(*) Total,
100.0 * COUNT(*) / SUM(COUNT(*)) OVER(PARTITION BY SaleDate) AS 'Percentage'
FROM
ProductSales ps
WHERE
SaleDate BETWEEN @startdt AND @enddt
GROUP BY
SaleDate, ProductType, ProductSubType
ORDER BY
SaleDate, ProductType
SUM(COUNT(*)) OVER(PARTITION BY SaleDate)
summarizes the counts and generates the total sum which you can use to get the percentage