Search code examples
oracleplsqlbulkupdatebulk-collect

Strange behaviour of BULK COLLECT


I tinkered together the following PL/SQL BULK-COLLECT which works astonishingly fast for updates on huge tables (>50.000.000). The only problem is, that it does not perform the updates of the remaining < 5000 rows per table. 5000 is the given limit for the FETCH instruction:

DECLARE
-- source table cursor (only columns to be updated)
  CURSOR base_table_cur IS
  select a.rowid, TARGET_COLUMN from TARGET_TABLE a
                        where TARGET_COLUMN is null;

    TYPE base_type IS
        TABLE OF base_table_cur%rowtype INDEX BY PLS_INTEGER;
    base_tab  base_type;    

-- new data
    CURSOR new_data_cur IS
      select a.rowid,
        coalesce(b.SOURCE_COLUMN, 'FILL_VALUE'||a.JOIN_COLUMN) TARGET_COLUMN from TARGET_TABLE a 
        left outer join SOURCE_TABLE b
        on a.JOIN_COLUMN=b.JOIN_COLUMN
        where a.TARGET_COLUMN is null;

    TYPE new_data_type IS TABLE OF new_data_cur%rowtype INDEX BY PLS_INTEGER;
    new_data_tab       new_data_type;
    TYPE row_id_type IS TABLE OF ROWID INDEX BY PLS_INTEGER;
    row_id_tab         row_id_type;
    TYPE rt_update_cols IS RECORD (
        TARGET_COLUMN   TARGET_TABLE.TARGET_COLUMN%TYPE
    );
    TYPE update_cols_type IS
        TABLE OF rt_update_cols INDEX BY PLS_INTEGER;
    update_cols_tab    update_cols_type;
    dml_errors EXCEPTION;
    PRAGMA exception_init ( dml_errors,-24381 );

BEGIN
    OPEN base_table_cur;
    OPEN new_data_cur;
    LOOP
        FETCH base_table_cur BULK COLLECT INTO base_tab LIMIT 5000;
        IF base_table_cur%notfound THEN
           DBMS_OUTPUT.PUT_LINE('Nothing to update. Exiting.');
           EXIT; 
        END IF;
        FETCH new_data_cur BULK COLLECT INTO new_data_tab LIMIT 5000;
        FOR i IN base_tab.first..base_tab.last LOOP
            row_id_tab(i) := new_data_tab(i).rowid;
            update_cols_tab(i).TARGET_COLUMN := new_data_tab(i).TARGET_COLUMN;
        END LOOP;

        FORALL i IN base_tab.first..base_tab.last SAVE EXCEPTIONS
            UPDATE (SELECT TARGET_COLUMN FROM TARGET_TABLE)
            SET row = update_cols_tab(i)
            WHERE ROWID = row_id_tab(i);

        COMMIT;
        EXIT WHEN base_tab.count < 5000; -- changing to 1 didn't help!
    END LOOP;

    COMMIT;
    CLOSE base_table_cur;
    CLOSE new_data_cur;
EXCEPTION
    WHEN dml_errors THEN
        FOR i IN 1..SQL%bulk_exceptions.count LOOP
            dbms_output.put_line('Some error occured');
        END LOOP;
END;

Where is my mistake? It looks correct to me though.


Solution

  • The problem is this line:

    IF base_table_cur%notfound THEN
    

    The cursor meets %NOTFOUND when the number of records found is less than the LIMIT value. So if the last fetch is not exactly 5000 those records won't be processed.

    It's a common gotcha for people using BULK COLLECT ... LIMIT for the first time. The solution is to change the exit condition to

    EXIT when base_tab.count() = 0;
    

    "I need to ensure, that the base_table_cur is not empty and exit if it is. I'l get an error if it is empty"

    The new_data_cur cursor includes the table which is selected in base_table_cur cursor. So I don't think you need the two loops. You need a simple test to see whether the first cursor returns something, then just loop round the second cursor.

    I'm not entirely clear on your logic, so I have changed as little as possible to demonstrate the sort of structure I think you need. However, the UPDATE statement looks a little odd, so you may still run into issues.

    OPEN base_table_cur;
    FETCH base_table_cur BULK COLLECT INTO base_tab LIMIT 1;
    if base_table_tab.count = 0 then 
        DBMS_OUTPUT.PUT_LINE('Nothing to update. Exiting.');
    else
        OPEN new_data_cur;
        LOOP
            FETCH new_data_cur BULK COLLECT INTO new_data_tab LIMIT 5000;
            exit when new_data_tab.count() = 0;
    
            FOR i IN base_tab.first..base_tab.last LOOP
                row_id_tab(i) := new_data_tab(i).rowid;
                update_cols_tab(i).TARGET_COLUMN := new_data_tab(i).TARGET_COLUMN;
            END LOOP;
    
            FORALL i IN base_tab.first..base_tab.last SAVE EXCEPTIONS
                UPDATE (SELECT TARGET_COLUMN FROM TARGET_TABLE)
                SET row = update_cols_tab(i)
                WHERE ROWID = row_id_tab(i);
    
        END LOOP;
        CLOSE new_data_cur;
    end if;
    COMMIT;
    CLOSE base_table_cur;