Search code examples
javasql-servermultithreadingsql-delete

Java Multithreaded delete on same sets of table


I had to cleanup database ( few tables with given condition , where columns for conditions are always same ) e.g.

delete from table1 where date < given_date1 and id = given_id
delete from table2 where date < given_date2 and id = given_id

Where given_id and givendate relation varies on both table by table and id by id.

The actual delete condition is not always where date < givendate , I just wrote for example, so say one id has got 300 days of data, and other of 500 days of data, the where condition is allowed to delete oldes 10 days of data where 10 is a variable, based on user input, so at one iteration all nodes are processed with deleting oldest 10 days of data and thus query changes for each id, but the fact is that it would be on same sets of table

earlier that script was written in as sql script and doing its operation but was taking time, Now I have implemented a multithreaded java application where the new code looks like

for(i=0; i < idcount ; i++)
{
   //launch new thread and against that thread call
   delete(date,currentid);
}

function delete(date,id)
{
    delete from table1 where date < given_date and id = given_id
    delete from table2 where date < given_date and id = given_id

}

after implementing this I found deadlock on sql table, which was solved by indexing the tables, but still its not fast as it is supposed to be, If I have 500 threads they are all launched one after other, and obviously running on same sets of table. and sql is not actually executing in parallel on each table ?

When I monitor my java.exe and sqlserver.exe, its not busy at all ? I hope it is supposed to be.

Could anyone tell me what could be best approach to implement multithreaded delete on same sets of table, so that I can bump up the thread and do deletion in parallel and consume available resources


Solution

  • If all the actions are delete on a given id the I would just do a delete on each table doing all the ids at once.

    e.g.

    delete from table1 where date < given_date and id in (given_id1, given_id2 ..... )
    

    If there are lots of given_ids the first insert them into a temporary table then execute each delete by joining the table to have deletions with the temporary table

    Also if trying to use multiple threads then the improvement is really only expected if you act on a table in a thread so there will not be contention in the database.