I have a event table which has 40 columns and fill up to 2 billion records. In that event table i would like to query for a combination event i.e Event A with Event B. Sometimes I may want to find more combination like Event A with B and C. It may goes to 5 or 6 combination.
I don't want to scan that table for every event in combination i.e Scanning for event A and scanning for event B. And I need a generic approach for more combination scanning as well.
Note: That 2 billion records is partitioned based on event date and data is been equally split.
Eg:
Need to find id's which has event A,B,C and need to find id's which has only A,B.
This number of combination is dynamic. I don't want to scan that table for each event and finally intersect the result.
There may be some mileage in using a sql server equivalent of the mysql group_concat function. For example
drop table t
create table t (id int, dt date, event varchar(1))
insert into t values
(1,'2017-01-01','a'),(1,'2017-01-01','b'),(1,'2017-01-01','c'),(1,'2017-01-02','c'),(1,'2017-01-03','d'),
(2,'2017-02-01','a'),(2,'2017-02-01','b')
select id,
stuff(
(
select cast(',' as varchar(max)) + t1.event
from t as t1
WHERE t1.id = t.id
order by t1.id
for xml path('')
), 1, 1, '') AS groupconcat
from t
group by t.id
Results in
id groupconcat
----------- -----------
1 a,b,c,c,d
2 a,b
If you then add a patindex
select * from
(
select id,
stuff(
(
select cast(',' as varchar(max)) + t1.event
from t as t1
WHERE t1.id = t.id
order by t1.id
for xml path('')
), 1, 1, '') AS groupconcat
from t
group by t.id
) s
where patindex('a,b,c%',groupconcat) > 0
you get this
id groupconcat
----------- ------------
1 a,b,c,c,d