Search code examples
javajdbcsqltransaction

Do not update row in ResultSet if data has changed


we are extracting data from various database types (Oracle, MySQL, SQL-Server, ...). Once it is successfully written to a file we want to mark it as transmitted, so we update a specific column.

Our problem is, that a user has the possibility to change the data in the meantime but might forget to commit. The record is blocked with a select for update statement. So it can happen, that we mark something as transmitted, which is not.

This is an excerpt from our code:

Statement stmt = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_UPDATABLE);
ResultSet extractedData = stmt.executeQuery(sql);
writeDataToFile(extractedData);
extractedData.beforeFirst();
while (extractedData.next()) {
    if (!extractedData.rowUpdated()) {
        extractedData.updateString("COLUMNNAME", "TRANSMITTED");
        // code will stop here if user has changed data but did not commit
        extractedData.updateRow();
        // once committed the changed data is marked as transmitted
    }
}

The method extractedData.rowUpdated() returns false, because technically the user didn't change anything yet. Is there any way to not update the row and detect if data was changed at this late stage?

Unfortunately I cannot change the program the user is using to change the data.


Solution

  • So you want to

    • Run through all rows of the table that have not been exported
    • Export this data somewhere
    • Mark these rows exported so your next iteration will not export them again
    • As there might be pending changes on a row, you don't want to mess with that information

    How about:

    You iterate over all rows. 
    
    for every row 
       generate a hash value for the contents of the row
       compare column "UPDATE_STATUS" with calulated hash
       if no match
         export row
         store hash into "UPDATE_STATUS" 
          if store fails (row locked) 
             -> no worries, will be exported again next time
          if store succeeds (on data already changed by user) 
             -> no worries, will be exported again as hash will not match
    

    This might further slow your export as you'll have to iterate over everything instead of over everything WHERE UPDATE_STATUS IS NULL but you might be able to do two jobs - one (fast) iterating over WHERE UPDATE_STATUS IS NULL and one slow and thorough WHERE UPDATE_STATUS IS NOT NULL (with the hash-rechecking in place)

    If you want to avoid store-failures/waits, you might want to store the hash /updated information into a second table copying the primary key plus the hash field value - that way user locks on the main table would not interfere with your updates at all (as those would be on another table)