Search code examples
sql-servert-sqlquery-optimizationin-subquery

Better tsql query to figure out who's a new customer (or alternative to NOT EXISTS)


I have an table with all our orders. The orders are linked to a person. And every person is attached to a company. Now I need a list of all companies have never ordered before a certain date. The query works fine, but it's very slow.

This is my T-SQL query:

SELECT
    DISTINCT p1.company_id
FROM
    order o
    JOIN person p1 ON (o.person_id = p1.id AND p1.company_id IS NOT NULL)
WHERE
    o.orderDate > '2017-01-01'
AND
    o.orderDate < '2017-09-01'
AND NOT EXISTS (SELECT
                    p2.company_id
                FROM
                    order o2
                    JOIN person p2 ON (o2.person_id = p2.id AND p2.company_id = p1.company_id)
                WHERE
                    o2.orderDate < '2017-01-01')

I've already changed it from a NOT IN to a NOT EXISTS. Since this was what most people here recommended. Didn't help much. A better index improved the situation a bit, but the query is still slow. I assume it's because for every order it has to do the sub-query.

This is the execution plan: https://www.brentozar.com/pastetheplan/?id=SyAlAU3db
For simplicity reasons I removed a few WHERE clauses in my example above)

The query runs on Azure SQL (and SQL Server Express 12 for development)

Anyone has a better idea on how to solve this?


Solution

  • If you have execution plan to share, it will help on performance analyse.

    I made some change on query as below, you may try if that improve it

    SELECT p1.company_id
    FROM  order o
    INNER JOIN person p1 
        ON (o.person_id = p1.id AND p1.company_id IS NOT NULL)
    GROUP BY p1.company_id
    HAVING SUM(CASE WHEN  (o.orderDate > '2017-01-01' AND  o.orderDate < '2017-09-01') THEN 1 ELSE 0 END) > 0
          AND
          SUM(CASE WHEN  orderDate < '2017-01-01' THEN 1 ELSE 0 END) = 0