PSQL: My main table has 20 million records and when I run my select it runs for hours. Is there a way to better write this statment ?
My table :
Select * from lookup limit 10;
-------------------------
month id
2010-01 598362
2010-01 598343
2010-02 598343
2010-02 988343
2010-03 789624
2010-04 789624
2010-05 789624
2010-06 899624
My select statement (below) works fine with small data (upto 100,000 records)
--PSQL
select month,
count (distinct id)id_ct,
count (distinct case when (
id||(month-1) in (select distinct id||month from lookup )
or id||(month-2) in (select distinct id||month from lookup ) )
then id end) continuous_ct
from lookup
group by 1 order by 1
Result :
month id_ct continuous_ct
2010-01 2 0
2010-02 2 1
2010-03 1 0
2010-04 1 1
2010-05 1 1
2010-06 1 0
Thanks!
If you have an index on the id then this simple query should work. Just left join the table to itself. http://sqlfiddle.com/#!15/133a9/7/0:
select to_char(lookup_a.lookup_month, 'yyyy-mm'),
count (distinct lookup_a.id) id_ct,
count (distinct lookup_b.id) continuous_ct
from lookup lookup_a
left join lookup lookup_b
on lookup_b.id = lookup_a.id
and lookup_b.lookup_month between lookup_a.lookup_month - interval '2 month'
and lookup_a.lookup_month - interval '1 month'
group by 1
order by 1