Search code examples
sqlsql-server-2008-r2recursive-querycross-apply

SQL Server Cross Apply on itself + Cross Apply performance issues


I have the following:

CREATE TABLE #MASTER_POSANDTRANS (ID CHAR(30), 
                                  QUANTITY INT, 
                                  START_OR_TRADE_DATE DATE) 

VALUES 
    ('APPLES ','150000','20150501'), ('PEARS','220000','20150501'), 
    ('APPLES ','-75000','20150506'), ('APPLES ','-65000','20150508'), 
    ('APPLES ','10000','20150516'),  ('APPLES ','-20000','20150519'
    ('PEARS','-110000','20150506'),  ('PEARS','-100000','20150519')

I want to be able to say "Ok at the beginning I had 150,000 apples, therefore my overall apple position was LONG"

Here is how I am currently solving this little puzzle:

SELECT 
    MPT.ID, 
    MPT.QUANTITY, 
    MPT.START_OR_TRADE_DATE, 
    APPLY_RES.QUANTITY as 'FIRST AVAILABLE  QUANTITY', 
    CASE 
       WHEN APPLY_RES.QUANTITY > 0 
         THEN 'LONG' 
         ELSE 'SHORT' 
    END as 'L/S Indicator'
FROM 
    #MASTER_POSANDTRANS as MPT 
CROSS APPLY
    (SELECT TOP 1 
         MPT_APPLY.QUANTITY 
     FROM 
         #MASTER_POSANDTRANS as MPT_APPLY
     WHERE 
         MPT_APPLY.ID = MPT.ID) APPLY_RES

The table of provided is, a simplification of the result of multiple queries and union alls that spit out about 387 rows in roughly 3 seconds. However, when I try and apply my solution to this 387 rows result, my query returns the correct result but takes 21 seconds instead of 3.

Any suggestions for improvement?

Little more to the puzzle:

  1. The first available "transaction" decides if I am long or short

  2. Another, more complex problem (for those who like puzzles) is to take an average weighted holding to decide if I was long or short during the period. So for example, imagine that after the last apples transaction on the 19th, where I had 0 apples left, I sold another 1million apples. This would mean that from the 19th to the 31st of May I was short 1 million. (31-19) * -1,000,000 < (19-1)*(Average of apples sold/bought during the month) therefore my holding of apples during that month could be considered as short. If at any point the two balanced, I would take the value in point 1.


Solution

  • You Cross Apply Select is missing an Order By (date?). It is not deterministic and there is no guarantee that you will indeed get the 1st record.

    On you sample data, this query has a better query plan:

    SELECT
        MPT.ID,
        MPT.QUANTITY,
        MPT.START_OR_TRADE_DATE
        , APPLY_RES.QUANTITY as 'FIRST AVAILABLE  QUANTITY',
        CASE
        WHEN APPLY_RES.QUANTITY > 0
            THEN 'LONG'
        ELSE 'SHORT'
        END as 'L/S Indicator'
    From MASTER_POSANDTRANS AS MPT
    Inner Join (
        Select ID, QUANTITY, N = ROW_NUMBER() over (Partition by ID order by START_OR_TRADE_DATE)
        From MASTER_POSANDTRANS
    ) as APPLY_RES on APPLY_RES.ID = MPT.ID
    Where APPLY_RES.N = 1
    --Order By MPT.ID, MPT.START_OR_TRADE_DATE