Search code examples
sqlhiveapache-spark-sql

Calculating time taken between rows in SQL


I'm new to SQL thus the question. I've the following table.

id | prev_state | curr_state | start_curr_state |selected_to_interview|interview_to_selected|
1  | APPLICANT. | INTERVIEW  | 2020-10-12.      | ?                   |
2  | NULL       | APPLICANT. | 2020-10-16       | NULL                | NULL.               |
1  | INTERVIEW. | SELECTED.  | 2020-1--20.      | NULL                | ?                   |

I've a set of 3 predetermined states of candidates - APPLICANT, INTERVIEW & SELECTED. The table provides a record of candidates who transitioned from one stage to another. The start_curr_state is the start date for the curr state - row 1 start of interview state. Depending upon the states transitioned, I need to calculate the time taken to transition for each stage.

I understand the time is date diff start_curr_state for the prev_state - start_curr_state But I'm unsure of how to calculate this. Any help appreciated.


Solution

  • you can use LAG() , I assumed you have a key called 'candidateid' for each person:

    select t1.* 
    , CASE WHEN t1.curr_state = 'INTERVIEW' AND prev_state = 'APPLICANT' 
    THEN datediff( lag(t1.start_curr_state) over (partition by t1.candidateid order by t1.start_curr_state) ,t1.start_curr_state) 
    ELSE NULL END  selected_to_interview
    from tablename t1