Search code examples
sql-servert-sqlcursor

How could I replace a T-SQL cursor?


I would like to ask you how I could replace a cursor that I've inserted into my stored procedure.

Actually, we found that cursor is the only way out to manage my scenario, but as I've read this is not a best practise.

This is my scenario:I have to calculate recursively the stock row by row and set the season according to what has been calculated in the previous rows.

I can set the season when the transfer type is "purchase". The others transfers should be set with the correct season by a T-SQL query.

The table where I should calculate the season has the following template and fake data, but they reflect the real situation:

Transfer Table Example

enter image description here

The rows that have the "FlgSeason" set as null, are calculated as follow: in ascending order, the cursor start from the row 3 and go back the previous rows and calculate the amount of stock for each season and then update the column season with the minimum season with stock.

Here's the code I used:

CREATE TABLE [dbo].[transfers]
(
    [rowId] [int] NULL,
    [area] [int] NULL,
    [store] [int] NULL,
    [item] [int] NULL,
    [date] [date] NULL,
    [type] [nvarchar](50) NULL,
    [qty] [int] NULL,
    [season] [nvarchar](50) NULL,
    [FlagSeason] [int] NULL
) ON [PRIMARY]

INSERT INTO [dbo].[transfers]
           ([rowId]
           ,[area]
           ,[store]
           ,[item]
           ,[date]
           ,[type]
           ,[qty]
           ,[season]
           ,[FlagSeason])
      VALUES (1,1,20,300,'2015-01-01','Purchase',3,'2015-FallWinter',1)
     , (2,1,20,300,'2015-01-01','Purchase',4,'2016-SpringSummer',1)
     ,  (3,1,20,300,'2015-01-01','Sales',-1,null,null)
     ,  (4,1,20,300,'2015-01-01','Sales',-2,null,null)
     ,  (5,1,20,300,'2015-01-01','Sales',-1,null,null)
     ,  (6,1,20,300,'2015-01-01','Sales',-1,null,null)
     ,  (7,1,20,300,'2015-01-01','Purchase',4,'2016-FallWinter',1)
     ,  (8,1,20,300,'2015-01-01','Sales',-1,null,null)

DECLARE @RowId as int
DECLARE db_cursor CURSOR FOR 
Select    RowID  
from Transfers
where [FlagSeason] is null
order by RowID

OPEN db_cursor   
FETCH NEXT FROM db_cursor INTO @RowId   

WHILE @@FETCH_STATUS = 0   
BEGIN   


Update Transfers
set Season = (Select min  (Season) as Season
                      from (
                          Select 
                            Season
                          , SUM(QTY)  as Qty 
                          from Transfers
                          where RowID < @RowId
                            and [FlagSeason] = 1
                          group by Season
                          having Sum(QTY) > 0  
                          )S
                          where s.QTY >= 0 
                     )
, [FlagSeason] = 1

where rowId = @RowId

       FETCH NEXT FROM db_cursor INTO @RowId   

    end

In this case the query would extract:

  • 3 qty for season 2015 FW
  • 4 for 2016 SS.

Than The update statment will set 2015-fw (the min over the two season with qty).

Then the courson go forward the row 4, and runs again the query to extract the stock updated considering the calculation at row 3. So the result should be

  • QTY 2 For 2015 FW
  • QTY 4 FOr 2016 SS

and then the update would set 2015 FW. And so on.

The final output should be something like this:

Output

enter image description here

Actually, the only way-out was to implement a cursor and now it takes above 30/40 minutes to scan and update about 2,5 million rows. Do anybody know a solution without recurring to a cursor?

Thanks in advance!


Solution

  • Updated to run on 2008

    IF OBJECT_ID('tempdb..#transfer') IS NOT NULL
      DROP TABLE #transfer;
    GO
    
    CREATE TABLE #transfer (
                             RowID INT IDENTITY(1, 1) PRIMARY KEY NOT NULL,
                             Area INT,
                             Store INT,
                             Item INT,
                             Date DATE,
                             Type VARCHAR(50),
                             Qty INT,
                             Season VARCHAR(50),
                             FlagSeason INT
                           );
    
    INSERT INTO #transfer ( Area,
                            Store,
                            Item,
                            Date,
                            Type,
                            Qty,
                            Season,
                            FlagSeason
                          )
    VALUES (1, 20, 300, '20150101', 'Purchase', 3, '2015-SpringSummer', 1),
    (1, 20, 300, '20150601', 'Purchase', 4, '2016-SpringSummer', 1),
    (1, 20, 300, '20150701', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20150721', 'Sales', -2, NULL, NULL),
    (1, 20, 300, '20150901', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20160101', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20170101', 'Purchase', 4, '2017-SpringSummer', 1),
    (1, 20, 300, '20170125', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20170201', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20170225', 'Sales', -1, NULL, NULL),
    (1, 21, 301, '20150801', 'Purchase', 4, '2017-SpringSummer', 1),
    (1, 21, 301, '20150901', 'Sales', -1, NULL, NULL),
    (1, 21, 301, '20151221', 'Sales', -2, NULL, NULL),
    (1, 21, 302, '20150801', 'Purchase', 1, '2016-SpringSummer', 1),
    (1, 21, 302, '20150901', 'Purchase', 1, '2017-SpringSummer', 1),
    (1, 21, 302, '20151101', 'Sales', -1, NULL, NULL),
    (1, 21, 302, '20151221', 'Sales', -1, NULL, NULL),
    (1, 20, 302, '20150801', 'Purchase', 1, '2016-SpringSummer', 1),
    (1, 20, 302, '20150901', 'Purchase', 1, '2017-SpringSummer', 1),
    (1, 20, 302, '20151101', 'Sales', -1, NULL, NULL),
    (1, 20, 302, '20151221', 'Sales', -1, NULL, NULL);
    WITH Purchases
    AS (SELECT  t1.RowID,
                t1.Area,
                t1.Store,
                t1.Item,
                t1.Date,
                t1.Type,
                t1.Qty,
                t1.Season,
                RunningInventory = ( SELECT SUM(t2.Qty)
                                     FROM   #transfer AS t2
                                     WHERE  t1.Type = t2.Type
                                            AND t1.Area = t2.Area
                                            AND t1.Store = t2.Store
                                            AND t1.Item = t2.Item
                                            AND t2.Date <= t1.Date
                                   )
        FROM    #transfer AS t1
        WHERE   t1.Type = 'Purchase'
       ),
         Sales
    AS (SELECT  t1.RowID,
                t1.Area,
                t1.Store,
                t1.Item,
                t1.Date,
                t1.Type,
                t1.Qty,
                t1.Season,
                RunningSales = ( SELECT SUM(ABS(t2.Qty))
                                 FROM   #transfer AS t2
                                 WHERE  t1.Type = t2.Type
                                        AND t1.Area = t2.Area
                                        AND t1.Store = t2.Store
                                        AND t1.Item = t2.Item
                                        AND t2.Date <= t1.Date
                               )
        FROM    #transfer AS t1
        WHERE   t1.Type = 'Sales'
       )
    SELECT  Sales.RowID,
            Sales.Area,
            Sales.Store,
            Sales.Item,
            Sales.Date,
            Sales.Type,
            Sales.Qty,
            Season = ( SELECT TOP 1
                            Purchases.Season
                       FROM Purchases
                       WHERE Purchases.Area = Sales.Area
                             AND Purchases.Store = Sales.Store
                             AND Purchases.Item = Sales.Item
                             AND Purchases.RunningInventory >= Sales.RunningSales
                       ORDER BY Purchases.Date, Purchases.Season
                     )
    FROM    Sales
    UNION ALL
    SELECT  Purchases.RowID ,
            Purchases.Area ,
            Purchases.Store ,
            Purchases.Item ,
            Purchases.Date ,
            Purchases.Type ,
            Purchases.Qty ,
            Purchases.Season 
    FROM    Purchases
    ORDER BY Sales.Area, Sales.Store, item, Sales.Date
    

    *original answer below**

    I don't understand the purpose of the flagseason column so I didn't include that. Essentially, this calculates a running sum for purchases and sales and then finds the season that has a purchase_to_date inventory of at least the sales_to_date outflow for each sales transaction.

    IF OBJECT_ID('tempdb..#transfer') IS NOT NULL
      DROP TABLE #transfer;
    GO
    
    CREATE TABLE #transfer (
                             RowID INT IDENTITY(1, 1) PRIMARY KEY NOT NULL,
                             Area INT,
                             Store INT,
                             Item INT,
                             Date DATE,
                             Type VARCHAR(50),
                             Qty INT,
                             Season VARCHAR(50),
                             FlagSeason INT
                           );
    
    INSERT INTO #transfer ( Area,
                            Store,
                            Item,
                            Date,
                            Type,
                            Qty,
                            Season,
                            FlagSeason
                          )
    VALUES (1, 20, 300, '20150101', 'Purchase', 3, '2015-FallWinter', 1),
    (1, 20, 300, '20150601', 'Purchase', 4, '2016-SpringSummer', 1),
    (1, 20, 300, '20150701', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20150721', 'Sales', -2, NULL, NULL),
    (1, 20, 300, '20150901', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20160101', 'Sales', -1, NULL, NULL),
    (1, 20, 300, '20170101', 'Purchase', 4, '2016-FallWinter', 1),
    (1, 20, 300, '20170201', 'Sales', -1, NULL, NULL);
    
    WITH Inventory
    AS (SELECT  *,
                PurchaseToDate = SUM(CASE WHEN Type = 'Purchase' THEN Qty ELSE 0 END) OVER (ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
                SalesToDate = ABS(SUM(CASE WHEN Type = 'Sales' THEN Qty ELSE 0 END) OVER (ORDER BY Date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW))
        FROM    #transfer
       )
    SELECT  Inventory.RowID,
            Inventory.Area,
            Inventory.Store,
            Inventory.Item,
            Inventory.Date,
            Inventory.Type,
            Inventory.Qty,
            Season = CASE
                       WHEN Inventory.Season IS NULL
                         THEN ( SELECT TOP 1
                                    PurchaseToSales.Season
                                FROM    Inventory AS PurchaseToSales
                                WHERE   PurchaseToSales.PurchaseToDate >= Inventory.SalesToDate
                                ORDER BY Inventory.Date
                              )
                       ELSE
                         Inventory.Season
                     END,
            Inventory.PurchaseToDate,
            Inventory.SalesToDate
    FROM    Inventory;
    

    *UPDATED*******************************

    You'll need an index on your data to help with the sorting in order to make this perform.

    Possibly:

    CREATE  NONCLUSTERED INDEX IX_Transfer ON #transfer(Store, Item, Date) INCLUDE(Area,Qty,Season,Type)
    

    You should see a index scan on the named index. It will not be a seek because the sample query does not filter any data and all of the data is included.

    In addition, you need to remove Season from the Partition By clause of the SalesToDate. Resetting the sales for each season will throw your comparisons off because the rolling sales need to be compared to the rolling inventory in order for you to determine the source of sales inventory.

    Two other tips for the partition clause:

    1. Don't duplicate the fields between partition by and order by. The order of the partition fields doesn't matter since the aggregate is reset for each partition. At best, the ordered partition field will be ignored, at worst it may cause the optimizer to aggregate the fields in a particular order. This does not have any effect on the results, but can added unnecessary overhead.

    2. Make sure your index matches the definition of the partition by/order by clause.

    The index should be [partitioning fields, sequence doesn't matter] + [ordering fields, sequence needs to match order by clause]. In your scenario, the indexed columns should be on store, item, and then date. If date were before store or item, the index would not be used because the optimizer will need to first handle partitioning by store & item before sorting by date.

    If you may have multiple areas in your data, the index and partition clauses would need to be

    index: area, store, item, date

    partition by: area, store, item order by date