Search code examples
sqloracle-databaseoracle11grecursive-querydatabase-link

force Oracle to process recursive CTE on remote db site (perhaps using DRIVING_SITE hint)


I am trying to fetch data from remote table. The data is expanded from seed set of data in local table using recursive CTE. The query is very slow (300 seed rows to 800 final rows takes 7 minutes).

For other "tiny local, huge remote"-cases with no recursive query the DRIVING_SITE hint works excellently. I also tried to export seed set from local table into auxiliary table on remotedb with same structure and - being logged in remotedb - ran query as pure local query (my_table as p, my_table_seed_copy as i). It took 4s, which encouraged me to believe forcing query to remote site would make query fast.

What's the correct way to force Oracle to execute recursive query on the remote site?

with s (id, data) as (
  select p.id, p.data
  from my_table@remotedb p
  where p.id in (select i.id from my_table i)
  union all
  select p.id, p.data
  from s
  join my_table@remotedb p on ...
)
select /*+DRIVING_SITE(p)*/ s.*
from s;

In the query above, I tried

  • select /*+DRIVING_SITE(p)*/ s.* in main select
  • select /*+DRIVING_SITE(s)*/ s.* in main select
  • omitting DRIVING_SITE in whole query
  • select /*+DRIVING_SITE(x)*/ s.* from s, dual@remotedb x as main select
  • select /*+DRIVING_SITE(p)*/ p.id, p.data in first inner select
  • select /*+DRIVING_SITE(p)*/ p.id, p.data in both inner selects
  • select /*+DRIVING_SITE(p) MATERIALIZE*/ p.id, p.data in both inner selects
  • (just for completeness - rewriting to connect by is not applicable for this case - actually the query is more complex and uses constructs which cannot be expressed by connect by)

All without success (i.e. data returned after 7 minutes).


Solution

  • Recursive query actually performs breadth-first search - seed rows represent 0-th level and recursive part finds element on n-th level from elements on (n-1)-th level. Original query was intended to be part of merge ... using ... clause.

    Hence I rewrote query to PLSQL loop. Every cycle generates one level. Merge prevents insertion of duplicates so finally no new row is added and loop exits (transitive closure is constructed). Pseudocode:

    loop
      merge into my_table using (
        select /*+DRIVING_SITE(r)*/ distinct r.* /*###BULKCOLLECT###*/
        from my_table          l
        join my_table@remotedb r on ...  -- same condition as s and p in original question are joined on
      ) ...
      exit when rows_inserted = 0;
    end loop;
    

    Actual code is not so simple since DRIVING_SITE actually does not directly work with merge so we have to transfer data via work collection but that's different story. Also the count of inserted rows cannot be easily determined, it must be computed as difference between row count after and before merge.

    The solution is not ideal. Anyway it's much faster than recursive CTE (30s, 13 cycles) because queries are provably utilizing the DRIVING_SITE hint.

    I will leave question open for some time to wait if somebody finds answer how to make recursive query working or proving it is not possible.