Search code examples
sql-servermathmeancalculated-columnsmedian

MS SQL Calculate MODE on GROUPED data


I have a table of Items, each marked with Latitude and Longitude and the Date it was entered into the table.

Each Item has a Supplier.

For each Supplier, I want to group their Items by Lat/Lng to find out how old most of the items are in each location - i.e. the MODE. I will then display this on a Google map with traffic light markers showing where data is old/new.

I've tried following other SQL MODE answers but they aren't grouping by any column, just getting the mode of one set of data.

Here's my code:

DECLARE @SupplierID int
SET @SupplierID = 12345

WITH cte AS
(
SELECT 
Round(p.Latitude,2) As Latitude,
Round(p.Longitude,2) As Longitude,
CASE 
WHEN p.Date >= getdate()-30 THEN 1 
WHEN p.Date >= getdate()-60 AND p.Date < getdate()-30 THEN 2 
WHEN p.Date >= getdate()-90 AND Date < getdate()-60 THEN 3 
WHEN p.Date < getdate()-90 THEN 4 END As Age,
Count(*) As Counter
FROM Items p

WHERE
p.SupplierID = @SupplierID AND
p.Latitude is not null AND
p.Longitude is not null

GROUP BY Round(p.Latitude,2),Round(p.Longitude,2),CASE 
WHEN p.Date >= getdate()-30 THEN 1 
WHEN p.Date >= getdate()-60 AND p.Date < getdate()-30 THEN 2 
WHEN p.Date >= getdate()-90 AND Date < getdate()-60 THEN 3 
WHEN p.Date < getdate()-90 THEN 4 END
)

SELECT * FROM cte

This gets me data like this:

Lat    Lng    Age    Counter
12.34  56.78  1      5
12.34  56.78  3      2
12.34  56.78  4      24

23.45  67.89  1      21
23.45  67.89  2      16
23.45  67.89  3      13

Now I need to select which Age is most prevalent grouping by each Lat/Lng. Despite hours of fiddling, I can't seem to figure it out.

The data I would expect would be:

Lat    Lng    Mode_Age
12.34  56.78  4
23.45  67.89  1

(I am rounding the Lat/Lng to reduce the data points.)


Solution

  • You can use ROW_NUMBER to get the top item per grouping

    DECLARE @SupplierID int = 12345;
    
    WITH cte AS
    (
        SELECT 
          Round(p.Latitude, 2) As Latitude,
          Round(p.Longitude, 2) As Longitude,
          v.Age,
          Count(*) As Counter,
          ROW_NUMBER() OVER (
            PARTITION BY Round(p.Latitude, 2),
                         Round(p.Longitude, 2)
            ORDER BY COUNT(*) DESC) As Rn
        FROM Items p
        CROSS APPLY (VALUES (
          CASE 
          WHEN p.Date >= getdate()-30 THEN 1 
          WHEN p.Date >= getdate()-60 AND p.Date < getdate()-30 THEN 2 
          WHEN p.Date >= getdate()-90 AND Date < getdate()-60 THEN 3 
          ELSE 4 END
        ) ) As v(Age)
    
        WHERE
          p.SupplierID = @SupplierID AND
          p.Latitude is not null AND
          p.Longitude is not null
    
        GROUP BY
          Round(p.Latitude, 2),
          Round(p.Longitude, 2),
          v.Age
    )
    
    SELECT
      Latitude,
      Longitude,
      Age
    FROM cte
    WHERE Rn = 1;
    

    Note the use of CROSS APPLY (VALUES to remove the code duplication