Search code examples
sqlsybase

Not getting expected output when calculating percentage in SQL using Count


I'm running a query but it doesn't output what I expect it to. I have the table below:

------------------------
security_ID | Date | Rep 
------------------------
2256        |202001|  0
2257        |202002|  0
2257        |202003|  0
2256        |202002|  1
2256        |202003|  2
2257        |202003|  1

I'm essentially trying to find the % of cases where the column Rep changes from 0 to 1, from one date to the next, when Rep was 0 for the previous date, for a given security_ID. The difference in the Dates should be 1 (for eg. with 202002-202001 = 1. Dates are integer here) for the calculation. Here, for the security_ID = 2256, the Percent = 100 as from 202001 to 202002, Rep changes from 0 to 1, and in the table the number of rows where 2256 was 0 is 1. The equation for Percent is:

Percent = (No of cases where Rep_current = 1 and Rep_prev = 0)/(No of cases where Rep_prev = 0) * 100

For security_ID = 2257, Percent = 1/2 * 100 = 50

For example, I want the output to be:

----------------------------------
security_ID | Date | Rep | Percent
----------------------------------
2256        |202001|  0  |  100
2257        |202002|  0  |  50 
2257        |202003|  0  |  50 
2256        |202002|  1  |  100 
2256        |202003|  2  |  100
2257        |202003|  1  |  50

I tried to do this as follows:

SELECT security_ID, Date, Rep, 
(COUNT(CASE WHEN Rep_prev = 0 and Rep = 1 then 1 else 0 end)/count(CASE WHEN Rep_prev = 0 then 1 else 0 end) * 100) as "Percent"
from
(
    select t1.security_id,t1.date,t1.rep,
           coalesce(t2.rep,0) as Rep_prev
      from mytable t1
      left join mytable t2
        on t1.security_id = t2.security_id
       and t2.date = t1.date - 1
    )
GROUP BY SECURITY_ID, Date, Rep

But I get the output as:

----------------------------------
security_ID | Date | Rep | Percent
----------------------------------
2256        |202001|  0  |  100
2257        |202002|  0  |  100 
2257        |202003|  0  |  100 
2256        |202002|  1  |  100 
2256        |202003|  2  |  100
2257        |202003|  1  |  100

Not too sure where my logic is off here.

Please let me know if youd like more info, cause its hard to put this idea across.


Solution

  • You can use the below logic (using Windows function) -

    Select mt.security_id, mt.Date, mt.Rep, (numerator*100)/denominator As "Percent"
    from
      (Select security_id,
           SUM(case when Rep_curr = 1 and coalesce(Rep_prev,-99) = 0 then 1 else 0 end) As numerator,
           SUM(case when coalesce(Rep_prev,-99) = 0 then 1 else 0 end) As denominator
        from
         (Select security_id, 
                 Rep As Rep_curr,
                 lag(Rep,1) over(partition by security_id order by Date) As Rep_prev
          from my_table) t 
       where t.Rep_prev is not NULL
       group by security_id) tab
       join my_table mt 
       on tab.security_id = mt.security_id;
    

    Here is a db fiddle link that demonstrates how it works in SQL Server - https://dbfiddle.uk/?rdbms=sqlserver_2014&fiddle=8654f0242bbed911f1ed63b795281bac. The above syntax would work on Sybase as well.

    EDIT : If I were to use your methodology, here is how I would do it -

    SELECT mt.*, Tab1.Perct As "Percent"
    FROM
        (SELECT security_ID,
                (SUM(CASE WHEN Rep_prev = 0 and Rep = 1 then 1 else 0 end) * 100/SUM(CASE WHEN Rep_prev = 0 then 1 else 0 end)) as Perct
        from
           (
            select t1.security_id,t1.date,t1.rep,
                   coalesce(t2.rep,0) as Rep_prev
              from mytable t1
              left join mytable t2
                on t1.security_id = t2.security_id
               and t2.date = t1.date - 1
              where  t2.rep IS NOT NULL
            ) t
        GROUP BY SECURITY_ID ) Tab1
        JOIN mytable mt
        ON TAB1.security_ID = mt.security_ID;