(Sorry don't have enough rep-point to comment on the similar asked question)
When the apex_data_parser processes file of more than 50MB, execution time explodes. As in the question asked here: apex_data_parser file > 50MB, I must agree with oCoXa, there must be a bug in the parser.
I find exactly the same issue on ATP + APEX 19.2.0.00.18. I also see the spike in IO and CPU when parsing files >~50mb, no spike at all when below 50mb.
I have loaded two files to a table (A41_temp_files), one being 54mb, the other one 47mb. (same rows until 47mb) In both cases I'm just parsing 3000 lines of the respective 171.000 and 199.000 lines in the files.
I run the parser in SQL Developer, it take 1,9sec for 47mb and 88,6sec for 54mb.
select *
from A41_temp_files f,
table(
apex_data_parser.parse(
p_content => f.BLOB_CONTENT,
p_max_rows => 3000,
P_DETECT_DATA_TYPES => 'N',
p_file_name => f.FILENAME
)
) p
where 1=1
and f.filename = 'CHUCK_47mb.txt' --3000 lines 1,9sec
--and f.filename = 'CHUCK_54mb.txt' --3000 lines 88,6sec
order by col001
;
the APEX_DATA_PARSER package has indeed a 50MB "threshold".
For BLOBs smaller than 50MB, APEX creates a new, cached(!), temporary LOB, copies the data over and does all parsing on that temporary LOB. So all LOB operations happen in memory.
If the BLOB is larger than 50MB, no temporary LOB is created and the LOB operations of APEX_DATA_PARSER are performed on the LOB Locator which has been passed in. In your case, it's the LOB locator for your column A41_TEMP_FILES.BLOB_CONTENT. Thus all the LOB operations happen in disk, which makes it slower.
APEX is built to run on shared instances with many workspaces and parallel users as well - so this threshold is to protect the (shared) SGA and database memory.
However, developers working with the APEX_DATA_PARSER package can always create a cached temporary LOB themselves and pass that to APEX_DATA_PARSER.PARSE. For this use the DBMS_LOB.CREATETEMPORARY procedure with P_CACHE => TRUE to create the temporary LOB, then DBMS_LOB.COPY to copy the data from the uploaded BLOB to the temporary BLOB and then pass the temporary BLOB to APEX_DATA_PARSER.PARSE.
E.g.
create or replace function create_cached_blob( p_file in blob ) return blob
is
l_blob blob;
begin
dbms_lob.createtemporary( lob_loc => l_blob, cache => true );
dbms_lob.copy(
dest_lob => l_blob,
src_lob => p_file,
amount => dbms_lob.lobmaxsize);
return l_blob;
end create_cached_blob;
then use this function in APEX_DATA_PARSER as follows:
:
apex_data_parser.parse(
p_content => create_cached_blob( f.BLOB_CONTENT),
p_max_rows => 3000,
: