Possible Duplicate:
Join vs. subquery
I need to work with a massive set of data, but need to on occasion exclude records based on a condition.
Since the amount of data being checked against, and also the amount of data that will be returned is in the tens of thousands at a time, and this will be run frequently, I was wondering if anybody could shed some light on which approach is best in terms of speed and load so keep things running as smoothly as we can
SELECT a.*
FROM table_a as a
LEFT JOIN table_b
ON table_b.a_id = a.id
WHERE table_b.status <> 'new'
OR
SELECT *
FROM table_a
WHERE id NOT IN (
SELECT a_id
FROM table_b
WHERE status <> 'new'
)
In most cases JOINs are faster than sub-queries and it is very rare for a sub-query to be faster.
In JOINs RDBMS can create an execution plan that is better for your query and can predict what data should be loaded to be processed and save time, unlike the sub-query where it will run all the queries and load all their data to do the processing.
The good thing in sub-queries is that they are more readable than JOINs: that's why most new SQL people prefer them; it is the easy way; but when it comes to performance, JOINS are better in most cases even though they are not hard to read too.
from here