I have a large table (large_table, 6M+ rows) with indexed columns
mid_ref_id (integer) references the primary key of a medium table (mid_table, 200k+ rows)
The data of small_table is divided by groups of data that are named in the indexed column group_name (varchar).
The data of mid_table is divided by types of data that are named in the indexed column type (varchar).
The large table has 2 indexed timestamps (date_time1,date_time2).
My query looks like :
select * from large_table, mid_table, small_table
where large_table.small_ref_id=small_table.id
and large_table.mid_ref_id=mid_table.id
and small_table.group_name='MyGroup'
and
(large_table.id in (select large_table.id from large_table, mid_table
where mid_table.id=large_table.mid_ref_id
and large_table.date_time1 between '2010-01-01' and '2017-01-01'
and mid_table.type='Type1')
or large_table.id in (select large_table.id from large_table, mid_table
where mid_table.id=large_table.mid_ref_id
and large_table.date_time2 between '2010-06-01' and '2017-01-01'
and mid_table.type='Type2')
or large_table.id in (select large_table.id from large_table, mid_table
where mid_table.id=large_table.mid_ref_id
and mid_table.date_time3 between '2010-08-01' and '2017-01-01'
and mid_table.type='Type3'))
It can take several minutes (<5min) to fetch the results.
What I tried :
Any ideas ?
It seems to me you don't need subqueries. You can just apply the three conditions to the large_table
and mid_table
data you already have from the main query:
select *
from large_table
inner join mid_table
on large_table.mid_ref_id = mid_table.id
inner join small_table
on large_table.small_ref_id = small_table.id
and small_table.group_name = 'MyGroup'
where (
(large_table.date_time1 between '2010-01-01' and '2017-01-01'
and mid_table.type = 'Type1')
or
(large_table.date_time2 between '2010-06-01' and '2017-01-01'
and mid_table.type = 'Type2')
or
(mid_table.date_time3 between '2010-08-01' and '2017-01-01'
and mid_table.type = 'Type3')
)
As a side note: use the inner join
syntax.
NB: Are you sure about the last condition? It tests on mid_table.date_time3
while the other two conditions take their date from large_table
...