been playing around with dense_rank, LAG, LEAD, ... all kinds of sorting but don't seem able to get this one solved.
Underneath a sample of my data and the result I expect/need.
pk_id | pk_id_row_num | result_Tech | source_id | source_descr | ranking |
---|---|---|---|---|---|
5649385 | 5649385_1 | 1 | Tech | 1 | |
5649385 | 5649385_2 | OK | 2 | IAC | 1 |
5437376 | 5437376_1 | 1 | Tech | 2 | |
5437376 | 5437376_2 | CANCEL | 1 | Tech | 2 |
5649387 | 5649387_1 | 1 | Tech | 3 | |
5649387 | 5649387_2 | OK | 2 | IAC | 3 |
5649387 | 5649387_3 | FWD | 1 | Tech | 4 |
5649387 | 5649387_4 | OK | 2 | IAC | 4 |
5649387 | 5649387_5 | FWD | 1 | Tech | 5 |
5649387 | 5649387_6 | OK | 2 | IAC | 5 |
5649387 | 5649387_7 | FWD | 1 | Tech | 6 |
5649387 | 5649387_8 | OK | 2 | IAC | 6 |
5647621 | 5647621_1 | 1 | Tech | 7 | |
5647621 | 5647621_2 | CANCEL | 1 | Tech | 7 |
5647621 | 5647621_3 | CANCEL | 1 | Tech | 7 |
5649364 | 5649364_1 | 1 | Tech | 8 | |
5649364 | 5649364_2 | OK | 2 | IAC | 8 |
5649364 | 5649364_3 | FWD | 1 | Tech | 9 |
5649364 | 5649364_4 | OK | 2 | IAC | 9 |
5649396 | 5649396_1 | 1 | Tech | 10 | |
5649396 | 5649396_2 | FWD | 2 | IAC | 10 |
5649396 | 5649396_3 | OK | 2 | IAC | 10 |
5652537 | 5652537_1 | 1 | Tech | 11 | |
5652537 | 5652537_2 | FWD | 2 | IAC | 11 |
5652537 | 5652537_3 | OK | 2 | IAC | 11 |
5652537 | 5652537_4 | FWD | 1 | Tech | 12 |
5652537 | 5652537_5 | OK | 2 | IAC | 12 |
5652537 | 5652537_6 | CANCEL | 1 | Tech | 12 |
This is about a tech raising a ticket for an operator. I need to group the correct actions with the correct request but there are several possibilities. The most common case is a request from a tech gets resolved by an operator. But a tech can open the ticket again (FWD in result_tech) after an operator has closed it. In this case this is considered a new sequence whithin the existing ticket. Also an operator can perform a 'FWD' to another operator. But than this remains in the same sequence. It comes down to this that, within the scope (tried partition by etc ) of 1 PK_ID, a combination of (result_tech is null and source_id = 1) or (result_tech = 'FWD' and source_id = 1) defines the start of the sequence and all following (ordered by pk_id_row_num) records define actions on this sequence, being a) a solution (result_tech = 'OK') or a forward (result_tech = 'FWD') from an operator (source_id = 2) b) a cancel from the tech (result_tech = 'CANCEL' and source_id = 1)
Mostly you'll have only 2 records in 1 pk_id/sequence but sometimes we have 3 and theoretically it could an infinitve number of records if the different operators keep on forwarding the request.
The collumn ranking is my desired result. This allows me to group the different actions whithin the scope of 1 pk_id to the correct sequence of events.
As said I tried several approaches The last one gets me some closer but not quite yet. There is still some fine tuning needed here. Hope somebody can help me out. There is no use of dense_rank() here but I have been playing with that too.
SELECT pk_id ,pk_id_source_id ,reason_id ,reason_desc ,result_tech ,source_id ,source_descr ,CASE WHEN (result_tech IS NULL OR result_tech = 'FWD') AND source_id = 1 THEN 'START' WHEN Lead(source_id,1,1) Over(PARTITION BY pk_id ORDER BY pk_id_source_id) <= source_id THEN 'NEXT' END sorting FROM My_example_table WHERE pk_id IN (5437376, 5647621, 5649364, 5649385, 5649387, 5649396, 5652537) ORDER BY pk_id_source_id;
I tried several combinations of windowf functions, mostly dense_rank, LAG and LEAD Played with several options to sort and or RESET, partitioning by, ...
The expected result is the last column. I just need to be able to identify, within the scope of 1 PK_ID, all records (start and actions) related to 1 sequence of events.
I usually solve these with a running sum/count like
Sum(Case When coalesce(result_tech,'FWD')='FWD' and source_id=1 Then 1 End)
Over (Partition By pk_id Order By pk_id_row_num Rows Between Unbounded Preceding And Current Row) as grouping
You can then write an outer SQL to use it in a Group By or whatever...
Note that most database systems assume that a count or sum in a window function by default is a running sum/count. If your DB is one of those, you can omit the windowing clause (Rows Between Unbounded Preceding And Current Row
)