Search code examples
sql-server-2005mode

Replicate Excel's mode function using SQL?


How can I replicate Excel's mode function using SQL?

If I run the mode function on a set of numbers in Excel it will return one mode value even if there are multiple mode values. I need some SQL to work in the same way.

For this series of numbers Excel returns a mode of 8. This is probably because 8 is the first modal number to appear.

6
7
8
3
3
8
0
2
2

If there is no mode example all numbers are unique it should return NA.

This is the code I have so far.

How can I replicate Excel's mode function using SQL? If I run the mode function on a set of numbers in Excel it will return one mode value even if there are multiple mode values. I need some SQL to work in the same way.

This is what I have so far. deleting rows where occurences=1 is to deal with series with no mode.

 --I wanted to use CTE for Mode, but it won't work as part of a union query
select RIC,Period,InputFile,Occurrences,amount into #Mode1 from
(SELECT aa.RIC,aa.Period,aa.inputfile,aa.amount,COUNT(*) AS occurrences
FROM tempPivotTable aa
--where aa.ric='USTRDAP' and aa.period='2006' and aa.inputfile='C:\FalconIngest\Input\US April 2006.xls'
GROUP BY aa.RIC,aa.Period,aa.inputfile,aa.amount) as A

Select RIC,vendor,Period,Filedate,inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,MaxAmount into #Mode2 from
(
select t.Ric,'O' as vendor,t.Period,Filedate,t.inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,max(occurrences) as MaxAmount
from TempPivotTable t
inner join #Mode1 A
on t.ric=a.ric and t.period=a.period and t.inputfile=a.inputfile
group by t.Ric,t.Period,Filedate,t.inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate
)as A

Select RIC,vendor,Period,Filedate,inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,Amount,occurrences into #Mode3 from
(
select a.RIC, 'O' as vendor,a.period,Filedate,a.inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,Amount,occurrences
from #Mode1 A
inner join #Mode2 M
on A.ric=M.ric and A.period=M.period and A.inputfile=M.Inputfile
where occurrences=maxamount 
) as A

--deal with cases where there is no mode
select ric,vendor,period,Filedate,inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,Amount
into #mode4 
from(
select   ric,'O' as vendor,period,Filedate,inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,0 as Amount from #mode3
where occurrences=1 
group by ric,period,Filedate,inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate
having count(*)>1
) as A
delete from #mode3 where occurrences=1 

select a.RIC, 'O' as vendor,a.period,Filedate,a.inputfile,YearlyOrQuarterly,sortablePeriod,NumericFiledate,Amount
from #Mode1 A
inner join #Mode2 M
on A.ric=M.ric and A.period=M.period and A.inputfile=M.Inputfile
where occurrences=maxamount and maxamount>1
union select * from #mode4  --Put series with no mode as NA

drop table #mode1
drop table #mode2
drop table #mode3
drop table #mode4

subsquently, I've come up with this simplified code.

select Code,inputfile,period,Amount,count(*) as Amountcount,
Ranking=dense_Rank() over (partition by Code,period,inputfile order by count(*) desc)
from TempPivotTable
group by Code,inputfile,period,Amountmore of those amount is the modal amount.  

It's OK where there is one mode value. In the example below 3 and 8 are the mode values. Where there are multiple mode values it must choose 8 as that appears first in the alphabetical list of vendors.

Vendor Amount A 6 B 7 C 8 D 3 E 3 F 8 G 0 H 2 I 2


Solution

  • Below statement return mode value if it is single, mode value of the first (alphabetically) Vendor if set is multimodal, and return NULL if all values unique (no mode exist).

    ;with m as (
        select *,
            aCnt = count(1) over (partition by amount),
            tCnt = count(1) over ()
        from TableName
    )
    select top (1)
        case
            when acnt = 1 and tCnt > 1 then NULL
            else amount
        end as mode
    from m
    order by acnt desc, vendor;
    

    SQLFiddle sample

    To find mode within (inputfile, code, period) tuple you may try:

    ;with r1 as (
        select inputfile, code, period, vendor, amount,
            acnt = count(1) over (partition by inputfile, code, period, amount),
            tcnt = count(1) over (partition by inputfile, code, period)
        from TableName
    ),
    r2 as (
        select inputfile, code, period, amount, acnt, tcnt,
            rn = row_number() over (partition by inputfile, code, period order by acnt desc, vendor)
        from r1
    )
    select inputfile, code, period,
        mode = case when acnt = 1 and tcnt > 1 then NULL else amount end
    from r2
    where rn = 1
    

    SQLFiddle sample