Search code examples
multithreadingperformanceoracle-databaserownum

Slow inner-N selection in Oracle


I've come up with a problem lately. We wanted to get a table of data from an Oracle DB in batches, where a batch is for example 4-5000. My previous "general" solution was to wrap the select into some kind of following thing:

SELECT * FROM (
    SELECT ROWNUM AS RN, INNERSELECT.* FROM (
        select THINGS1, THINGS2 from MYTABLE)
    )
INNERSELECT WHERE ROWNUM < 4001) WHERE RN >= 3001

Seemed to be slow as when the rownum borders are high (for example 4-500.000) then a select will take more than 10-20 seconds. And the time is increasing linearily as I increase the values. Of course I know why, so I wanted to find another solution to get data in batch but with a guarantied time frame. Well then I used this kind of select:

SELECT * FROM (
  SELECT INNERSELECT.*, ROW_NUMBER() OVER (ORDER BY INNERSELECT.ROWID) RN FROM 
    ( select THINGS1, THINGS2 from MYTABLE)
    INNERSELECT
) WHERE RN BETWEEN 3001 AND 4000;

Which is a bit slow (It took 3-4 seconds to get a batch of data for a test), but I could increase the batch size without noticable performance drop and it will be the same for high values. The current problem is that I select from several tables in multiple threads from the DB and the performance drops to hell if I just use 3 threads (and we are not talking about the 16 which is currently used for a project). The JDBC connections are pooled, the performance drop is on the server side. I think it is obvious that the select is eating the CPU time so I cannot do anything, but I hope that you all have some hints or tips how to optimize this.

Other problem is that the tool we use is not project specific, it has to be general as possible so I cannot use copy tables for example, etc.

Thanks for the answers in advance


Solution

  • Assuming that MYTABLE has a numeric primary key ID, one tactic would be to have each of the 16 threads execute this query:

    SELECT THINGS1, THINGS2
    FROM   MYTABLE
    WHERE  MOD(ID,15) = :THREAD_NUMBER
    

    Each thread would get a unique value for :THREAD_NUMBER from 0 to 15. That means, each thread will get (approximately) 1/16th of the rows.