I have two large tables that I am joining using an intermediate mapping table (details of structure below).
I am trying to join all three on t1.date = t2.date
and t1.id_a = int.id_a
and int.id_b = t2.id_b
I also have a where clause that restricts the data to a very specific range of the date column (result set is ~25k rows).
Running either the join of table 1 and the int table (with where clause) or the join of table 2 with the int table (with where clause) takes literally 2 seconds each. It should then be trivial to join these two results sets that are around 37k rows for table 1 and 200k rows for table 2.
Instead it consistently takes 8 minutes for this query:
select t1.date, t1.id_b, t1.other_cols, t2.other_cols
from t1
inner join t_int on t1.id_a = t_int.id_a
inner join t_2 on t2.date = t1.date and t2.id_b = t_int.id_b
where t1.date between '2018-10-21' and '2018-12-10'
In the estimated (and actual) execution plan, SQL Server says it will do:
- clustered index seek on t1, seeking my date range (cost 33%)
- compute scalar t.id_a (cost 0%)
- clustered index seek on t2, seeking my date range (cost 33%)
- nested loop to join [2] and [3] (cost 0%)
- non clustered index seek on t_int, seeking t_int.id_a = t1.id_a and t_int.id_b = t2.id_b (cost 33%)
- nested loop to join [4] and [5] (cost 0%)
- compute scalar t.date, t_int.id_b (cost 0%)
Table 1:
date,
id_a,
other columns
(3.2m rows, date and id_a are primary key w clustered index)
Table 2:
date,
id_b,
other columns
(18.5m rows, date and id_b are primary key w clustered index)
Mapping table:
id_a,
id_b,
other columns
(35k rows, id_b is primary key w clustered index, additional non_clustered index on [id_a, id_b, other_col
])
There are no other indexes and no constraints at all (apart from the mentioned primary key constraints).
I have already rebuilt the index on t2
Can anyone help with what I need to do?
I had already rebuilt the index on t2 because it was fragmented. But I hadn't rebuilt the index on t1 or t_int since they looked fine.
Thanks to Mohammad Mohabbati's suggestion in the comments, I rebuilt them all and the query now runs in less than 1 second.
So the answer here, which may be useful to others with similar problems, is "rebuild all the indexes, even if they look fine"