I need to simplify a complex sql statement and I have come a long way, now I would like to remove a condition which seems superfluous to me: making sure that rows which already have the value are not updated.
Question: In general, will one of the two update statemens be faster than the other?
Option 1: (simpler)
UPDATE example_table SET example_column = 0
WHERE some_conditions
Option 2: (with an additional condition not to update elements which are already 0)
UPDATE example_table SET example_column = 0
WHERE some_conditions AND example_column <> 0
Subquestion: if option 2, why isn't the database optimization able to detect this?
Option 2 will be at least as fast as Option 1. Depending on the way your database is configured, Option 2 may be faster, especially when the number of rows that have zero represents a significant fraction of rows matching some_condition
, and your database defines update triggers with expensive logic on example_table
.
The reason the optimizer cannot optimize Option 1 into Option 2 is that their semantics are not identical. Although both options ultimately leave RDBMS in the same state, all updates must be reported to RDBMS triggers. For Option 1, this includes trivial updates, which are excluded for Option 2.