Search code examples
sqldatabaseoraclebigdataexadata

Increase speed of Oracle query running on ExaData


I am working on an Oracle query and I badly need to make it go faster. I would greatly appreciate any advice.

  • The database is Oracle, running on an ExaData cluster.
  • Oracle version: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production

I have two tables.

1) Transactions: a purchase at a store - TransactionID

2) TransactionItems: each purchase has 1..many items - TransactionID, ItemID

In each table, there are two flags:

  • FlagA: Y/N
  • FlagB: Y/N

The query needs to:

  1. Set the value of FlagA and FlagB for every record in TransactionItem.
  2. Set the value of FlagA and FlagB for each row in Transaction, based on the values of the Flags in TransactionItem

I have broken my query into 4 steps.

  1. Set value of Flag A for TransactionItem
  2. Set value of Flag B for TransactionItem
  3. Set value of Flag A for Transaction
  4. Set value of Flag B for Transaction

The query runs smoothly. However, this is the catch. There are billions of Transaction records, and each Transaction has about 7 Transaction Items.

Here is how fast it goes now:

  • Total time: 616 seconds / 10.27 minutes
  • Processes 1,218 Transactions per second / 73,000 transactions’ per minute

I tracked the process time for each step:

  1. Set value of Flag A for TransactionItem

    • 4 minutes 52 seconds
  2. Set value of Flag B for TransactionItem

    • 3 minutes 26 seconds
  3. Set value of Flag A for Transaction

    • 1 minute 6 seconds
  4. Set value of Flag B for Transaction

    • 0 minutes 51 seconds

Below is my full query. Here are the other tables used

Product

  • Each TransactionItem has a ProductId Each product has a ProductCode.
  • One product code has many Products

FlagAproductCodes

  1. A single column with a list of ProductCodes that are categorized as FlagA

FlagBproductCodes

  1. A single column with a list of ProductCodes that are categorized as FlagB

TransactionPayment

  1. This is a fact table containing payment details for each transaction

Payment_Dim

  1. Links to TransactionPayment on PaymentID
  2. This is needed because FlagB is set based on Payment_Dim.PaymentName

I have these indexes:

Transactions 1. TransactionID

TransactionItems 1. TransactionID 2. ProductID

Product 1. ProductID 2. ProductCode

FlagAproductCodes 1. ProductCode

FlagBproductCodes 1. ProductCode

Payment 1. PaymentID 2. PaymentCode 3. Payment_Name

I really appreciate the help, thanks

-- 1.   Set value of FlagA for TransactionItem

Update
          TransactionItems  Item

Set FlagA = 
      (
          Select 
                    Case 


                    When  
                              Item.FlagA_Qty = 0 Then 'N' -- this is the quantity of items purchased that fall into the FlagA category

                    When 
                              FlagA.ProductCode Is Null Then 'N'  

                    Else
                               'Y'

                    End 

          From  
                    Product Prod 
                    Left Join 
                              FlagAproductCodes FlagA 
                                        On Product.ProductCode = FlagA.ProductCode   


          Where  
Product.Prod_Id = Item.Prod_Id                    

        )
;



-- 2.  Set value of FlagB for TransactionItem

Update TransactionItems

Set  FlagB = 'Y'

Where ItemID In 
          (
          Select 
                    Trans_Items.ItemID
          From 
                    TransactionItems Trans_Items
                    Inner Join Product Prod 
                              On Trans_Items.Prod_Id = Product.Prod_Id 
                    Inner Join FlagBproductCodes FlagB 

                              On Product.ProductCode = FlagB.ProductCode
          Where 
                    (
                              Trans_Items.Gov_FlagA_Qty < Trans_Items.Item_Qty
                    ) 
                    AND 
                    (
                              Exists 
                              (Select Transaction_Payment_Fid 
                              From TransactionPayment Trans_Pay

                                        Inner Join Warehouse.Payment_Dim Pay_Dim 
                                                  On Trans_Pay.Payment_Id = Pay_Dim.Payment_Id
                              Where 
                                        Transaction_Fid = Trans_Items.Transaction_Fid 
                                        And Upper(Pay_Dim.Payment_Name) Like '%ABC%'
                              )
                    )
          )          
; 
Update TransactionItems
Set FlagB = 'N'
Where FlagB Is Null;






-- 3: Set FlagA for Transactions


Update
          Transactions  

Set 
          Gov_FlagA_Flag =
                    Case When Exists 
                              (Select ItemID 
                              From TransactionItems Item 
                              Where Item.Transaction_Fid = Transactions.Transaction_Fid 
                              and gov_FlagA_flag = 'Y') 
                              Then 'Y'
                    Else 'N'
                    End
;


-- 4: Set FlagB for Transactions


Update
          Transactions  

Set 
          FlagB =
                    Case When Exists 
                              (Select ItemID 
                              From TransactionItems Item 
                              Where Item.Transaction_Fid = Transactions.Transaction_Fid 
                              And FlagB = 'Y') 
                              Then 'Y'
                    Else 'N'
                    End
;

Solution

  • You need to look into parallel execution, which is probably too broad a topic to fully explore here. (And I'm not qualified to say much about it).

    In the meantime you might get some benefit by only updating each table once, and reducing the number of incidental look-ups. This is untested but I think covers the same logic as your three updates against TransactionItems, for example:

    merge into TransactionItems TI
    using (
      select P.Prod_ID,
        case when FAPC.ProductCode is null then 'N' else 'Y' end as FlagA,
        case when FBPC.ProductCode is null then 'N' else 'Y' end as FlagB
      from Product P
      left join FlagAproductCodes FAPC on FAPC.ProductCode = P.ProductCode
      left join FlagAproductCodes FBPC on FBPC.ProductCode = P.ProductCode
    ) temp
    on (temp.Prod_id = TI.Prod_ID)
    when matched then
    update set TI.FlagA = case when temp.FlagA = 'Y' and TI.FlagA_Qty != 0
        then 'Y' else 'N' end,
      TI.FlagB = case when TI.FlagA_Qty < TI.Item_Qty
        and exists (
          select Transaction_Payment_Fid 
          from TransactionPayment TP
          join Payment_Dim PD on TP.Payment_Id = PD.Payment_Id
          where TP.Transaction_Fid = TI.Transaction_Fid 
          and upper(PD.Payment_Name) Like '%ABC%'
        ) then 'Y' else 'N' end
    /
    

    You might prefer to create an updatable view. But on that volume of data it's still going to take a long time.

    This might also be useful.