I am using HSQLDB 2.4.1 (embedded, with cached tables). I have a large database (~21 million rows, about 5 GB on disk), and I'm trying to run the following query:
UPDATE TABLE_NAME SET COLUMN1=0
I tried changing and playing around with these properties, but the update statement still consumes a huge amount of RAM, as if it were copying the entire database into memory:
properties.setProperty("hsqldb.large_data" , "true");
properties.setProperty("hsqldb.log_data" , "false");
properties.setProperty("hsqldb.default_table_type" , "cached");
properties.setProperty("hsqldb.result_max_memory_rows" , "50000");
properties.setProperty("hsqldb.tx" , "mvcc");
properties.setProperty("sql.enforce_tdc_update" , "false");
properties.setProperty("shutdown" , "true");
properties.setProperty("runtime.gc_interval" , "100000");
When I execute this query in DBeaver, memory consumption increases steadily until it reaches the 4 GB maximum heap, at which point the application crashes with an out-of-memory error.
PS: Running this exact query against an identical embedded Derby database takes around 5 minutes but eventually completes, and memory usage in DBeaver stays constant at around 400 MB.
All the updated rows are held in memory until commit. You can update in chunks based on the primary key and a LIMIT clause, committing after each chunk. For example:
UPDATE TABLE_NAME SET COLUMN1 = 0 WHERE COLUMN1 != 0 AND (PK_ID > 1000000 AND PK_ID < 2000000) LIMIT 1000000
The statement above shows two different techniques to limit the rows: first, restricting the range of an indexed column; second, using LIMIT to stop after a given number of rows have been updated.
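A minimal sketch of the chunking loop in Java (the table, column, and PK names come from the question; the chunk size, PK range, and total row count of ~21 million are assumptions you would adjust). It only builds the per-chunk SQL strings; in a real program you would run each one via JDBC with `Statement.executeUpdate()` and call `Connection.commit()` between chunks so the held rows are released:

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkedUpdate {
    // Builds one UPDATE statement per PK range so each chunk can be
    // executed and committed separately, bounding the number of
    // updated rows held in memory at any one time.
    static List<String> buildChunkStatements(long minPk, long maxPk, long chunkSize) {
        List<String> statements = new ArrayList<>();
        for (long start = minPk; start <= maxPk; start += chunkSize) {
            long end = Math.min(start + chunkSize, maxPk + 1);
            statements.add(
                "UPDATE TABLE_NAME SET COLUMN1 = 0 " +
                "WHERE COLUMN1 != 0 AND PK_ID >= " + start + " AND PK_ID < " + end);
        }
        return statements;
    }

    public static void main(String[] args) {
        // With ~21 million rows and a 1M chunk size this yields 21 statements;
        // execute each, then commit, before moving to the next chunk.
        for (String sql : buildChunkStatements(1, 21_000_000, 1_000_000)) {
            System.out.println(sql);
        }
    }
}
```

This assumes `PK_ID` is reasonably dense; the `COLUMN1 != 0` predicate keeps re-runs cheap because already-updated rows are skipped.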
http://hsqldb.org/doc/2.0/guide/dataaccess-chapt.html#dac_update_statement