Search code examples
mysqlsqlinner-joincorrelated-subquerysqlperformance

SQL inner join - performance improvement


I have the following SQL but wish to improve the performance (Option 2 below currently takes about 24 seconds).

EXPLAIN MySQL - using example in answer belowenter image description here

Option 1 - basic select (very slow)

select f.name, f.flowid, m.traceid, m.traceday, m.logtimestamp
from flow f, messageinfo m 
where traceid = (select max(traceid) from messageinfo where flowid = f.flowid)

UPDATE

Option 2 - with inner join (faster but still too slow)

select m.traceid, f.name, f.flowid, m.traceday, m.logtimestamp
from flow f
  inner join messageinfo m 
  on m.flowid = f.flowid
  where traceid = (select max(traceid) from messageinfo where flowid = 
  f.flowid)
order by f.name

I need to loop through the flow records (there are approx 900) and for each flow select the related flow in the messageinfo table that is the most recent, i.e. has the highest traceid for that particular flow. Both SQL examples above work but would like to see if I can improve performance for Option 2 if possible. The final result should be a best effort and only return a traceid where a related flow record can be found (like an inner join). Can anyone suggest a more performant way? Bear in mind I would like to return the data from both tables in the final result (as shown with example above).

I need to also emphasize I don't have the freedom to modify the DB with new/updated indexes. So mainly looking for improvements to SQL without DB modifications.

That said, it is still beneficial noting any index improvements for future work.


Solution

  • Another way to rewrite your query as using join, Move your dependent sub query part to sub clause and join this with your main query.

    select m.traceid, f.name, f.flowid, m.traceday, m.logtimestamp
    from flow f
    inner join messageinfo m on m.flowid = f.flowid
    inner join (
        select flowid, max(traceid) traceid
        from messageinfo 
        group by flowid
    ) m1 on  m.flowid = m1.flowid and m.traceid = m1.traceid
    order by f.name
    

    Also add composite index on (flowid,traceid)