Search code examples
sqlsql-serverquery-optimization

Optimizing SQL join query with date comparison


I have a query that takes around 2 seconds to fetch 16,900 rows:

SELECT x.lid
FROM schema1.view1 x
INNER JOIN schema1.view2 y
    ON x.cid = y.cid
        and datediff(day, x.indt, y.linvcy)=0    -- problematic line
        and x.indt = y.indt                      -- alternative line I've tried

But if I add time comparison (last line) to the join logic, the query takes over an hour to execute (I've never actually been able to run it to completion).

Each of the views themselves contain join logic so the complexity of the query plan is quite high. I am looking for some pointers to help optimize this query.

schema1.view1:

select
    D.lid,
    D.cid,
    D.indt,
    lower(trim(substring(C.bse, 1, charindex('-', C.bse)-1))) as bse
from (
    select
        B.lid,
        B.cid,
        B.indt
    from ds.v A
    inner join (
        select
            x.lid,
            x.cid,
            x.ivid,
            x.clior,
            x.sosy,
            x.indt
        from rd.ui x
        where
            x.clior like 'abc'
    ) B
    on
        B.ivid like A.ivn
        and B.clior like A.clior
) D
left join ctlg.brchs C
on
    C.paor like D.clior
    and C.sosy like D.sosy
    and C.bid like D.cid

rd.ui:

create table rd.ui (
    id int identity(1,1) primary key,
    lid varchar(256),
    cid varchar(256),
    ivid varchar(256),
    indt date,
    clior varchar(256),
    sosy varchar(256)
)

create unique index idx_unlid
on rd.ui(lid)

create index idx_sebid
on rd.ui(bid)

create index idx_seindt
on rd.ui(indt)

create index idx_sec
on rd.ui(clior)

ctlg.brchs:

create table ctlg.brchs (
    id int identity(1,1) primary key,
    paor varchar(256),
    sosy varchar(256),
    bid varchar(256)
)
create unique index idx_unbr
on ctlg.brchs (paor, sosy, bid)

schema1.view2:

SELECT 
    A.cid
    , A.cinvcy
    , B.linvcy
FROM (
    SELECT
        x.cid
        , MAX(x.indt) AS cinvcy
    FROM schema1.view1 x
    GROUP BY
        x.cid
) A
INNER JOIN (
    SELECT
        z.cid
        , MAX(z.indt) AS linvcy
    FROM (
        SELECT
            x.cid
            , x.indt
        FROM schema1.view1 x
        WHERE CONCAT(x.cid, x.ind) NOT IN (
            SELECT CONCAT(y.cid, MAX(y.indt))
            FROM schema1.view1 y
            GROUP BY y.cid 
        )
    ) z
    GROUP BY
        z.cid
) B
ON A.cid=B.cid

I tried creating indexed views using with schemabinding, but those views contain CTEs and derived tables (select * from (select * from x)) which means I can't create indexes on those views.

UPDATE: Changing like to = significantly improved the speed of that part of the query, but the rest is still extremely slow.


Solution

  • In schema1.view2 this CONCAT matching is weird.

    SELECT
        x.cid
        , x.indt
    FROM schema1.view1 x
    WHERE CONCAT(x.cid, x.ind) NOT IN (
        SELECT CONCAT(y.cid, MAX(y.indt))
        FROM schema1.view1 y
        GROUP BY y.cid
    

    My understanding is that this query is trying to exclude "last value" (last date?) Try changing it to:

    SELECT x.cid, x.indt
    FROM(
        SELECT x.cid, x.indt, ROW_NUMBER() OVER( PARTITION BY cid ORDER BY indt DESC ) AS Maxindt
        FROM schema1.view1 x
         ) AS y
    WHERE Maxindt <> 1
    

    Try adding the following index to rd.ui table

    create index idx_PickYourName on rd.ui(ivid , clior)

    As suggested by @siggemannen in the comments please replace all LIKE matches with = where possible.