Search code examples
sqloracleperformancedb2db2-zos

Oracle slow to join local table to DB2 Mainframe


This one is puzzling me. Pulling from the DB2 by itself is fast, and pulling from the table is fast, but I don't know why they don't play nicely together. I don't have access to the indices of the DB2 table or the server there.

This query takes 0.017 seconds:

select
    PART_NO, 
    APRV_DT, 
    round((CURRENT_DATE - APRV_DT)/365.242199,1) as AGE,
    rank() over (partition by PART_NO order by APRV_DT asc) rnk,
    FROM DB2_TABLE
where PART_NO in 
    ('529711',
    '627862',
    '325712',
    '979257',
    '168570',
    '004297')

Obviously I don't want to hard-code all the part numbers because I have almost 200k of them to query.

I left the part numbers in here just to try and get this working. This query where I select the same 6 part numbers takes 1.23 seconds:

select distinct PART_NUMBER from PARTS_REPORT
where PART_NUMBER in 
    ('529711',
    '627862',
    '325712',
    '979257',
    '168570',
    '004297')

The issue is when I combine these together:

In my mind, this query should take about 3 seconds or something. It takes 492 Seconds.

select
    PART_NO, 
    APRV_DT, 
    round((CURRENT_DATE - APRV_DT)/365.242199,1) as AGE,
    rank() over (partition by PART_NO order by APRV_DT asc) rnk,
    FROM DB2_TABLE
where PART_NO in 
(
  select distinct PART_NUMBER from PARTS_REPORT
   where PART_NUMBER in 
       ('529711',
       '627862',
       '325712',
       '979257',
       '168570',
       '004297')
  )

Is there a better way to do this? Do I need to index my PARTS_REPORT table? What's the key here?

edit: to run all 200k-ish part numbers, the same query takes 564 seconds - around the time it takes what I have above to run.

Edit 2: the user below helped me to know what was going on - I have to pull down the whole remote table, and that's slow. I think I understand what's happening now - thanks.


Solution

  • Summarizing my comments as an answer:

    In your first query you are providing an explicit IN list to the query, so it executes remotely on the DB2 server in its entirety, returning of data from DB2_TABLE.

    When you attempt to retrieve the search criteria from a local table (which you do with where PART_NO in) you force a join of a remote and a local table, for which the entire remote table has to be sent to the local server where the join is performed.

    Sending the local table (or subset thereof) to the remote server to perform the join there presumably requires less bandwidth. You could achieve this by declaring a temporary table, loading it with the list of part numbers from the Oracle table, then performing your query against two remote tables, localizing the join there.

    You already have some privileges in the remote database, which allow you to query its table(s); try and see if you can run DECLARE GLOBAL TEMPORARY TABLE -- by default it does not require any privileges beyond normal PUBLIC privileges.