Search code examples
sqloracle

How to get rid of 0 values from extract, as I think it's causing first and last row to aggregate and give false rows in Oracle?


What I am wanting to do, for each employee, is to get the difference between their clockin, and their next respective clockin in hours, and I am utilizing lag. The problem is, the below select gives me the resulting extract(single employee), and I think the 0/zero row is causing the issue, and also causing a false row(s) when I pull multiple employees(second extract screenshot).

Here is my select:

SELECT DISTINCT
 p.emp_name,
 e.emp_id,
 TO_CHAR(ce.clockin_dt_tm), 'MM/DD/YYYY HH24:MI' clockin,
 (NVL(ce.clockin_dt_tm - LAG(ce.clockin_dt_tm) OVER(PARTITION BY e.emp_id ORDER BY e.emp_id),0)*24) between_clockins

Resulting extract for single employee:

enter image description here

Resulting extract for multiple employees(red is employee one-same as single employee above, blue is employee 2):

enter image description here

I think if I can get rid of the 0/zero values, that will fix what I need.

  1. How would I go about getting rid and/or have SQL not consider the 0/zero values for calculation?
  2. If #1 is not the issue, is there anything I can adjust to make it come out correctly?

Any and all help/directions/references/guidance greatly appreciated.

Thanks, Roland


Solution

    • If you want to ignore the first row during aggregation then do not wrap the expressions calculating the difference in NVL and just leave it to be NULL.
    • If you want the difference between the current row and the previous row based on timestamps then ORDER BY ce.clockin_dt_tm and not ORDER BY e.emp_id.

    Like this:

    SELECT e.emp_name,
           e.emp_id,
           TO_CHAR(ce.clockin_dt_tm), 'MM/DD/YYYY HH24:MI' clockin,
           (
             ce.clockin_dt_tm
             - LAG(ce.clockin_dt_tm) OVER(
                 PARTITION BY e.emp_id ORDER BY ce.clockin_dt_tm
               )
           )*24 AS between_clockins
    FROM   employees e
           LEFT OUTER JOIN employee_clockins ce
           ON (ce.emp_id = e.emp_id)
    

    If you then want to aggregate and find the average difference:

    SELECT MAX(emp_name) AS emp_name,
           emp_id,
           AVG(between_clockins) AS average_between_clockins
    FROM   (
      SELECT e.emp_name,
             e.emp_id,
             TO_CHAR(ce.clockin_dt_tm), 'MM/DD/YYYY HH24:MI' clockin,
             (
               ce.clockin_dt_tm
               - LAG(ce.clockin_dt_tm) OVER(
                   PARTITION BY e.emp_id ORDER BY ce.clockin_dt_tm
                 )
             )*24 AS between_clockins
      FROM   employees e
             LEFT OUTER JOIN employee_clockins ce
             ON (ce.emp_id = e.emp_id)
    )
    GROUP BY emp_id
    

    If you also want to filter out the times when the difference was zero then add a WHERE filter:

    SELECT MAX(emp_name) AS emp_name,
           emp_id,
           AVG(between_clockins) AS average_between_clockins
    FROM   (
      SELECT e.emp_name,
             e.emp_id,
             TO_CHAR(ce.clockin_dt_tm), 'MM/DD/YYYY HH24:MI' clockin,
             (
               ce.clockin_dt_tm
               - LAG(ce.clockin_dt_tm) OVER(
                   PARTITION BY e.emp_id ORDER BY ce.clockin_dt_tm
                 )
             )*24 AS between_clockins
      FROM   employees e
             LEFT OUTER JOIN employee_clockins ce
             ON (ce.emp_id = e.emp_id)
    )
    WHERE  between_clockins > 0
    GROUP BY emp_id
    

    fiddle