Search code examples
oracleexadata

Oracle/ExaData RowNum Range returns duplicate rows


I am executing this query on exadata in a for loop of a java program

select a, b, c, d from (
        select rownum r, a, b, c, d from foo order by c asc
) where r >= 40001 and r < 50001

Here i keep incrementing the numbers by 10000, so in the next loop iteration the numbers would be 50001 and 60001

I take the rows and insert into another database and loop again.

Currently, my code encounters random errors like

Exception in thread "main" java.sql.BatchUpdateException: Duplicate entry '[email protected]' for key 'PRIMARY'
    at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1809)
    at 

When I check my source (exadata) database, there is only 1 row with a=23 and b = [email protected]. So the source doesn't have any duplication.

when I check my target database, I can see that the row a=23, b = [email protected] has already been inserted in a previous loop iteration. {{I drop and recreate the destination table at the start of the program.}}

So it looks like my window query is returning the same row again and again.

I did lot of search and I am pretty sure that my windowing query should not return duplicates... but it seems like it does

I am not an oracle/exadata expert... so let me know if there is a change that the query above can return the same row when run with different rownum ranges.


Solution

  • Use row_number() over() that's actually a window function. Since a and b seems to make the record unique, you should try.

    select a, b, c, d 
      from (
            select row_number() over (order by a,b) as r, 
                    a, b, c, d 
              from foo
            ) 
        where r >= 40001 
          and r < 50001;
    

    PS: Keep in mind that with this method there should not be any DML's over source table during load. PS2: rownum will never work in this case because is assigned before the order by. More info.