Search code examples
mysqlsqlexpressionengine

Slow aggregate query with join on same table


I have a query to show customers and the total dollar value of all their orders. The query takes about 100 seconds to execute.

I'm querying on an ExpressionEngine CMS database. ExpressionEngine uses one table exp_channel_data, for all content. Therefore, I have to join on that table for both customer and order data. I have about 14,000 customers, 30,000 orders and 160,000 total records in that table.

Can I change this query to speed it up?

SELECT link.author_id          AS customer_id, 
       customers.field_id_122  AS company, 
       Sum(orders.field_id_22) AS total_orders 
FROM   exp_channel_data customers 
       JOIN exp_channel_titles link 
         ON link.author_id = customers.field_id_117 
            AND customers.channel_id = 7 
       JOIN exp_channel_data orders 
         ON orders.entry_id = link.entry_id 
            AND orders.channel_id = 3 
GROUP  BY customer_id 

Thanks, and please let me know if I should include other information.

UPDATE SOLUTION

My apologies. I noticed that entry_id for the exp_channel_data table customers corresponds to author_id for the exp_channel_titles table. So I don't have to use field_id_117 in the join. field_id_117 duplicates entry_id, but in a TEXT field. JOINING on that text field slowed things down. The query is now 3 seconds

However, the inner join solution posted by @DRapp is 1.5 seconds. Here is his sql with a minor edit:

SELECT
  PQ.author_id CustomerID,
  c.field_id_122 CompanyName,
  PQ.totalOrders
FROM
  ( SELECT
       t.author_id
          SUM( o.field_id_22 ) as totalOrders
       FROM
          exp_channel_data o
       JOIN 
          exp_channel_titles t ON t.author_id = o.entry_id AND o.channel_id = 3
       GROUP BY
          t.author_id ) PQ
  JOIN 
     exp_channel_data c ON PQ.author_id = c.entry_id AND c.channel_id = 7
  ORDER BY CustomerID

Solution

  • If this is the same table, then the same columns across the board for all alias instances. I would ensure an index on (channel_id, entry_id, field_id_117 ) if possible. Another index on (author_id) for the prequery of order totals

    Then, start first with what will become an inner query doing nothing but a per customer sum of order amounts.. Since the join is the "author_id" as the customer ID, just query/sum that first. Not completely understanding the (what I would consider) poor design of the structure, knowing what the "Channel_ID" really indicates, you don't want to duplicate summation values because of these other things in the mix.

    select
          o.author_id,
          sum( o.field_id_22 ) as totalOrders
       FROM
          exp_channel_data customers o
       where
          o.channel_id = 3
       group by
          o.author_id
    

    If that is correct on the per customer (via author_id column), then that can be wrapped as follows

    select
          PQ.author_id CustomerID,
          c.field_id_122 CompanyName,
          PQ.totalOrders
       from
          ( select
                  o.author_id,
                  sum( o.field_id_22 ) as totalOrders
               FROM
                  exp_channel_data customers o
               where
                  o.channel_id = 3
               group by
                  o.author_id ) PQ
          JOIN exp_channel_data c
             on PQ.author_id = c.field_id_117 
            AND c.channel_id = 7