Search code examples
mysqlsqlscalar-subquery

Scalar subquery taking a lot of time to execute


I have the following tables:

  1. table1 - session_id, company_id, session_start_time
  2. table2 - id, session_id, message_time, message_type, message
  3. table3 - company_id, company_name

table1 stores the sessions done on companies. Each session has a lot of messages, which are stored in table2. For a specific message_type (say message_type = 2), a substring needs to be extracted from message. This substring is a company name. Now I need to find the next session_id from table1, where the company_id matches that of the company name extracted from message and the session_start_time >= message_time.

I am using the following query.

select t1.session_id as session1,
       t1.company_id as company1,
       @transfer_time := t2.message_time as transfer_time,
       @company2 := trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) as company2,
       (select t1.session_id
        from table1 as t1
          inner join table3 as t3 on t1.company_id = t3.company_id
        where t1.session_start_time >= @transfer_time
          and t3.company_name = @company2
       order by t1.session_start_time
       limit 1) as session 2
from table1 as t1
  inner join table2 as t2 on t1.session_id = t2.session_id
  inner join table3 as t3 on t1.company_id = t3.company_id
where t2.message_type = 2

The original query is slightly more complicated having a few more scalar sub-queries. This query is taking exceptionally long to execute. I checked with the explain function and the scalar sub-query seems to be taking very long to execute. However, I am unable to think of a better way.


Solution

  • This is the query:

    select t1.session_id as session1, t1.company_id as company1,
           @transfer_time := t2.message_time as transfer_time,
           @company2 := trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) as company2,
           (select t1.session_id
            from table1 t1 inner join
                 table3 t3
                 on t1.company_id = t3.company_id
            where t1.session_start_time >= @transfer_time and
                  t3.company_name = @company2
            order by t1.session_start_time
            limit 1
           ) s session 2
    from table1 t1 inner join
         table2 t2
         on t1.session_id = t2.session_id inner join
         table3 t3
         on t1.company_id = t3.company_id
    where t2.message_type = 2;
    

    First, the use of variables is not correct. MySQL does not guarantee the order of evaluation of variables in a SELECT. So, you need to put in the original definitions:

    select t1.session_id as session1, t1.company_id as company1,
           t2.message_time as transfer_time,
           trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) as company2,
           (select t1.session_id
            from table1 t1 inner join
                 table3 t3
                 on t1.company_id = t3.company_id
            where t1.session_start_time >= t2.message_time and
                  t3.company_name = trim(substring(t2.message, 38, locate(' abc', t2.message) - 38)) 
            order by t1.session_start_time
            limit 1
           ) s session 2
    from table1 t1 inner join
         table2 t2
         on t1.session_id = t2.session_id inner join
         table3 t3
         on t1.company_id = t3.company_id
    where t2.message_type = 2;
    

    Next, I would try indexes: table3(company_name, company_id) and table1(company_id, session_start_time, session_id).