Extracting data based on data in multiple columns and rows

I have the following data which represents a call flow coming into our centre and being transferred to a consultant (CSO).

CallID   Sequence    Action          Location         Input    NextLocation

1135     0           CallStart       NULL             NULL     NULL    
1135     1           MenuStart       EFTPosHelpDesk   NULL     NULL    
1135     2           KeyPress        EFTPosHelpDesk   3        TransferCSO
1135     3           TransferEntry   EFTPosHelpDesk   NULL     NULL
1135     4           TransferFlag    NULL             NULL     NULL
1135     5           AccessNum       NULL             NULL     NULL
1135     6           Transfer        NULL             NULL     NULL
1135     7           Hangup          NULL             NULL     NULL

I'm using MS SQL Management Studio 2005 to extract the data, however the data itself is stored on a MS SQL 2000 server.

There are literally millions of calls recorded in the database and I need to extract the CallID's where a customer was transferred to a CSO. The target data that tells me a call was transferred to a CSO is exactly as follows:

@Sequence 2 -> NextLocation = 'TansferCSO'
@Sequence 3 -> Action = 'TransferEntry'
@Sequence 4 -> Action = 'TransferFlag'
@Sequence 5 -> Action = 'AccessNum'
@Sequence 6 -> Action = 'Transfer'
@Sequence 7 -> Action = 'Hangup'

... and it's always in that sequence, but the "Sequence" numbering and "Location" will vary as some calls can run for 50 - 70 steps on average and we have 100's of IVRs (Locations).

I'm fairly new to SQL and I've tried using FETCH and IF / ELSE but without success. ROW_NUMBER() wont work due to the data being on a MS SQL2000 server.

Any examples or advice would be greatly appreciated.

Solution

This will give you all the calls (CallID's) that have been transferred to CSO based on all 6 of the entries being present (in any sequence).

  select CallID
    from yourTable
   where Action in ('TransferEntry', 'TransferFlag',
                    'AccessNum', 'Transfer', 'Hangup')
      or NextLocation = 'TransferCSO'
group by CallID
  having count(distinct Action) = 6;

Note: The assumption is that the row with NextLocation='TransferCSO' has a different but specific action, e.g. "KeyPress"

If you absolutely need to ensure that the 6 events are consecutive, you can use the below. A minor variation to the JOIN conditions at the end can also be used for in-order without being necessarily consecutive.

create table #tmpCalls (CallID int, Sequence int, Action varchar(20));

select c.CallID, c.Sequence, c.Action
  from
(
      select CallID
        from yourTable
       where Action in ('TransferEntry', 'TransferFlag',
                        'AccessNum', 'Transfer', 'Hangup')
          or (Action = 'KeyPress' and NextLocation = 'TransferCSO')
    group by CallID
      having count(distinct Action) = 6
) a
  join yourTable c on c.CallID = a.CallID
 where c.Action in ('TransferEntry', 'TransferFlag',
                    'AccessNum', 'Transfer', 'Hangup')
    or (c.Action = 'KeyPress' and c.NextLocation = 'TransferCSO');

create clustered index #ix_tmpCalls on #tmpCalls(CallID, Sequence, Action);

select distinct a.CallID
  from #tmpCalls a                 -- or perhaps just:  b.Sequence > a.Sequence
  join #tmpCalls b on b.Action = 'TransferEntry' and b.Sequence = a.Sequence + 1
  join #tmpCalls c on c.Action = 'TransferFlag' and c.Sequence = b.Sequence + 1
  join #tmpCalls d on d.Action = 'AccessNum' and d.Sequence = c.Sequence + 1
  join #tmpCalls e on e.Action = 'Transfer' and e.Sequence = d.Sequence + 1
  join #tmpCalls f on f.Action = 'Hangup' and f.Sequence = e.Sequence + 1
 where a.Action = 'KeyPress' and a.NextLocation = 'TransferCSO';

Note that the subquery is the original query to narrow down the candidates. The temporary table is used to make it perform quickly since we can cluster it around the 3 columns.