Search code examples
oracle-databaseapache-nifi

Very slow UPDATE with PutDatabaseRecord processor


I have 2 PutDatabaseRecord which are writing to Oracle DB.

This is part of the schema:

Schema

The red square there ins't error, but information messages, thus I put the processor to INFO mode. So it tells about fetching the schema, and "Commit because of batch size" with SQL update command.

The problem is, that first of them: ml_task, processes several millions records in few minutes, but the other one: ml_sales - stack with 1000 records several hours... This stacks both at night, when DB is much loaded and at a day time.

In a same moment, update goods.ml_[name] set load_date = sysdate statements takes same time via the SQLDeveloper interface - both for ml_task and ml_sales. It takes around of 10 minutes as night and several minutes at a day.

Both of them working with same pooling service.

This is the configuration of the bottom part of the service:

Pooling config

Both of the processors have same configuration, except the table name and update keys.

Processors' config

I tried set Max Batch Size to zero, it have no influence.

Both of the processors configured run in one thread,but I tried to configure 10 threads - there is no influence.

Also, there isn't lack of connections to the DB at all, I have around of 10 processors, every one uses one thread, so 50 connections, I think is enough.

There are around 5 millions record they are processing.

This is JSON of ml_task:

[{"DOC_ID": 1799041400,"LINE_D":694098344,"LOAD_DATE":"16-Jul-21"} ... ]

This is something similar to Nifi's update for ml_task:

update goods.ml_task
set load_date = sysdate
where doc_id = ? 
and line_id = ?;

This is the table:

Name          Null?    Type   
------------- -------- ------ 
DOC_ID        NOT NULL NUMBER 
LINE_ID       NOT NULL NUMBER 
ORG_ID        NOT NULL NUMBER 
NMCL_ID       NOT NULL NUMBER 
ASSORTMENT_ID NOT NULL NUMBER 
START_DATE    NOT NULL DATE   
END_DATE               DATE   
ITEMS_QNT              NUMBER 
MODIFY_DATE            DATE   
LOAD_DATE              DATE   

This is a JSON of ml_sales:

[{"REP_DATE":"06-Jul-21","NMCL_ID":336793,"ASSORT_ID":7,"RTT_ID":92,"LOAD_DATE":"16-Jul-21"} ... ]

Request for ml_sales is such as:

update goods.ml_sales set load_date = sysdate
where nmcl_id = ?
and assort_id = ?
and rtt_id = ?
and rep_date = ?;

And the table:

Name        Null?    Type       
----------- -------- ---------- 
REP_DATE    NOT NULL DATE       
NMCL_ID     NOT NULL NUMBER(38) 
ASSORT_ID   NOT NULL NUMBER     
RTT_ID      NOT NULL NUMBER(38) 
OUT_ITEMS            NUMBER     
MODIFY_DATE          DATE       
LOAD_DATE            DATE  

What can be reason for so slow update of ml_sales?

UPDATE

I put all the circuits to STOP except the problematic one, I committed all sessions in SQLDeveloper... and it is the same result... very long...

UPDATE

As I mentioned above, I actually copied the ml_sales table to other scheme and this didn't influenced the result.


Solution

  • I guess there were problem in lack of indexes in Oracle. I asked our DBA's to check the problem in DB side, they fixed it, now the UPDATE works as fast as for ml_task table, but unfortunately I still don't know what exactly was the problem, thus out DBA left for vocation. So, again, very possible that it were indexes problem.