I'm working on a project involving words and its translations. One of the queries a translator must task frequently (once every 10 sec or so) is:
SELECT * FROM pwords p
LEFT JOIN words w ON p.id = w.wordid
WHERE w.code IS NULL
OR (w.code <> "USER1" AND w.code <> "USER2")
ORDER BY rand() LIMIT 10
To receive a word to be translated, which the user has not translated already. In this case we want to disallow words input by USER2
The pwords table has around 66k entries and the words table has around 55k entries.
This query takes about 500 seconds to complete, whereas if I remove the IS NULL the query takes 0.0245 ms. My question here is: is there a way to optimize this query? I really need to squeeze the numbers.
The scenario is: USER1 does not want any database entries from USER2 in the words table. It does not want it's own database entries from the same table. Therefore I need to have the IS NULL or a similar method to get entries from all users except USER1 and USER2, either from other users or NULL entries.
tl;dr So my question is: is there a way to make this query run faster? Is "IS NULL" optimizable?
Any and all help is greatly appreciated.
You can try using subquery in order to filter the rows (use WHERE
statement) as soon as possible:
SELECT *
FROM pwords p
LEFT JOIN
(SELECT *
FROM words w
WHERE (w.code <> "USER1" AND w.code <> "USER2")) subq
ON p.id = subq.wordid
WHERE w.code IS NULL
ORDER BY rand() LIMIT 10
another (and maybe a more efficient option) is using NOT EXISTS
statement:
SELECT *
FROM pwords p
WHERE NOT EXISTS
(SELECT *
FROM words w
WHERE p.id=w.wordid AND (w.code <> "USER1" AND w.code <> "USER2"))
ORDER BY rand() LIMIT 10