Search code examples
sqlsql-servernativequery

How can I improve the native query for a table with 7 millions rows?


I have the below view(table) in my database(SQL SERVER).

enter image description here

I want to retrieve 2 things from this table.

  • The object which has the latest booking date for each Product number. It will return the objects = {0001, 2, 2019-06-06 10:39:58} and {0003, 2, 2019-06-07 12:39:58}.
  • If all the step number has no booking date for a Product number, it wil return the object with Step number = 1. It will return the object = {0002, 1, NULL}.

The view has 7.000.000 rows. I must do it by using native query.

The first query that retrieves the product with the latest booking date:

SELECT DISTINCT *
FROM TABLE t
WHERE t.BOOKING_DATE = (SELECT max(tbl.BOOKING_DATE) FROM TABLE tbl WHERE t.PRODUCT_NUMBER = tbl.PRODUCT_NUMBER)

The second query that retrieves the product with booking date NULL and Step number = 1;

SELECT DISTINCT *
FROM TABLE t
WHERE (SELECT max(tbl.BOOKING_DATE) FROM TABLE tbl WHERE t.PRODUCT_NUMBER = tbl.PRODUCT_NUMBER) IS NULL AND t.STEP_NUMBER = 1

I tried using a single query, but it takes too long. For now I use 2 query for getting this information but for the future I need to improve this. Do you have an alternative? I also can not use stored procedure, function inside SQL SERVER. I must do it with native query from Java.


Solution

  • Try this,

    Declare @p table(pumber int,step int,bookdate datetime)
    insert into @p values 
    (1,1,'2019-01-01'),(1,2,'2019-01-02'),(1,3,'2019-01-03')
    ,(2,1,null),(2,2,null),(2,3,null)
    ,(3,1,null),(3,2,null),(3,3,'2019-01-03')
    
    ;With CTE as
    (
    select pumber,max(bookdate)bookdate 
    from @p p1 
    where bookdate is not null
    group by pumber
    )
    
    select p.* from @p p
    where exists(select 1 from CTE c 
    where p.pumber=c.pumber and p.bookdate=c.bookdate)
    union all
    select p1.* from @p p1
    where p1.bookdate is null and step=1
    and not exists(select 1 from CTE c 
    where p1.pumber=c.pumber)
    

    If performance is main concern then 1 or 2 query do not matter,finally performance matter.

    Create NonClustered index ix_Product on Product (ProductNumber,BookingDate,Stepnumber)
    Go
    

    If more than 90% of data are where BookingDate is not null or where BookingDate is null then you can create Filtered Index on it.

     Create NonClustered index ix_Product on Product (ProductNumber,BookingDate,Stepnumber)
    where BookingDate is not null
     Go