Search code examples
sql-server-2008t-sqljoinjoin-hints

SQL SERVER 2008 JOIN hints


Recently, I was trying to optimise this query

UPDATE Analytics
SET UserID = x.UserID
FROM Analytics z 
INNER JOIN UserDetail x ON x.UserGUID = z.UserGUID

Estimated execution plan show 57% on the Table Update and 40% on a Hash Match (Aggregate). I did some snooping around and came across the topic of JOIN hints. So I added a LOOP hint to my inner join and WA-ZHAM! The new execution plan shows 38% on the Table Update and 58% on an Index Seek.

So I was about to start applying LOOP hints to all my queries until prudence got the better of me. After some googling, I realised that JOIN hints are not very well covered in BOL. Therefore...

  1. Can someone please tell me why applying LOOP hints to all my queries is a bad idea. I read somewhere that a LOOP JOIN is default JOIN method for query optimiser but couldn't verify the validity of the statement?
  2. When are JOIN hints used? When the sh*t hits the fan and ghost busters ain't in town?
  3. What's the difference between LOOP, HASH and MERGE hints? BOL states that MERGE seems to be the slowest but what is the application of each hint?

Thanks for your time and help people!

I'm running SQL Server 2008 BTW. The statistics mentioned above are ESTIMATED execution plans.


Solution

  • Can someone please tell me why applying LOOP hints to all my queries is a bad idea. I read somewhere that a LOOP JOIN is default JOIN method for query optimiser but couldn't verify the validity of the statement?

    Because this robs the optimizer of the opportunity to consider other methods which can be more efficient.

    When are JOIN hints used? When the sh*t hits the fan and ghost busters ain't in town?

    When the data distribution (on which the optimizer makes its decisions) is severely skewed and the statistics are no able to represent it correctly.

    What's the difference between LOOP, HASH and MERGE hints? BOL states that MERGE seems to be the slowest but what is the application of each hint?

    These are different algorithms.

    1. LOOP is nested loops: for each record from the outer table, the inner table is searched for matches (using the index of available). Fastest when only a tiny portion of records from both tables satisfy the JOIN and the WHERE conditions.

    2. MERGE sorts both tables are traverses them in the sort order, skipping the unmatched records. Fastest for the FULL JOINs and when both recordsets are already sorted (from previous sort operations or when the index access path is used)

    3. HASH build a hash table in the temporary storage (memory or tempdb) from one of the tables and searches it for each record from the other one. Fastest if the large portion of records from either table matches the WHERE and JOIN condition.