I have a list of ID numbers as one table in my mySQL database; I have a second table that has From_ID
, To_ID
, and Frequency
columns.
I'd like to make a third table that has the same structure as the second, but with only those rows for which the 'from' and 'to' IDs are in the first table.
The first table has ≈ 80k rows and the second has ≈ 45 million. It is taking so long that the process does not seem to end in a reasonable time (not less than a day).
My current query is as follows:
CREATE table3 AS (SELECT * FROM table2
WHERE from_id IN (SELECT id FROM table1)
AND to_id IN (SELECT id FROM table1);
If anybody could tell me a more efficient way to go about this I would really appreciate it!
First, use exists
rather than in
:
SELECT t2.*
FROM table2 t2
WHERE EXISTS (SELECT 1 FROM table1 t1 WHERE t1.id = t2.from_id) AND
EXISTS (SELECT 1 FROM table1 t1 WHERE t1.id = t2.to_id);
Then be sure that you have an index on table1(id)
. The latter is really important.
As a note: you can test the query in a user interface by putting limit 100
, then limit 1000
etc. onto the query. This will let you see what the performance is like as the data grows.