I'm working on existing database with millions of inserts per day. Database design itself pretty bad and filtering records from it takes huge amount of time. we are in the process of moving this to ELK cluster but in the mean time I have to filter some records for immediate use.
I have two tables like this
table - log_1
datetime | id | name | ip
2017-01-01 01:01:00 | 12345 | sam | 192.168.100.100
table - log_2
datetime | mobile | id
2017-01-01 01:01:00 | 999999999 | 12345
I need to filter my data using ip and from the log_1 and datetime on both log_1 and log_2. to do that I use below query
SELECT log_1.datetime, log_1.id, log_1.name, log_1.ip, log_2,datetime, log_2.mobile, log_2.id
FROM log_1
INNER JOIN log_2
ON log_1.id = log_2.id AND log_1.datetime = log_2.datetime
where log_1.ip = '192.168.100.100'
limit 100
Needless to say this take forever to retrieve results with such large number of records. is there any better method I can do the same thing without waiting long time mysql to respond ?. In other words how can I optimized my query against such large database.
database is not production and it's for just analytics
First of all, your current LIMIT
clause is fairly meaningless, because the query has no ORDER BY
clause. It is not clear which 100 records you want to retain. So, you might want to use something like this:
SELECT
l1.datetime,
l1.id,
l1.name,
l1.ip,
l2.datetime,
l2.mobile,
l2.id
FROM log_1 l1
INNER JOIN log_2 l2
ON l1.id = l2.id AND l1.datetime = l2.datetime
WHERE
l1.ip = '192.168.100.100'
ORDER BY
l1.datetime DESC
LIMIT 100;
This would return the 100 most recent matching records. As for speeding up this query, one way to at least make the join faster would be to add the following index on the log_2
table:
CREATE INDEX idx ON log_2 (datetime, id, mobile);
Assuming MySQL chooses to use this index, it should make the join much faster, because each id
and datetime
value can be looked up in a B-tree instead of doing a manual scan of the entire table. Note that the index also covers the mobile
column, which is needed in the select.