The following query (sourced from here Postgres SQL SELECT and UPDATE behaving differently)
update fromemailaddress
set call = true
from email
where email.fromemailaddress = fromemailaddress.fromemailaddress and
LOWER(email.subject) ~ 'tester';
The way I read it is:
Line 1: update fromemailaddress
-- we tell the database that we are updating the fromemailaddress table
Line 2: set call = true
-- we tell the database that the field named "call" will be set to true
Line 3: from email
Line 4: where email.fromemailaddress = fromemailaddress.fromemailaddress and
Line 5: LOWER(email.subject) ~ 'tester';
-- OK now things are getting fuzzy. What actually happens here? It appears that somehow the database takes the result of the query in lines 3 4 and 5, but how does that tell it which rows to update in the fromemailaddress table? What is the pseudocode? Is it something like:
for each row in (query from lines 3, 4, 5)
set call=true?
I just can't see how the FROM side of the SQL update relates to the other side.
UPDATE:
Following the valuable links in @Erwin's answer below lead me to this information that gets to the core of what I was trying to understand:
http://www.postgresql.org/docs/current/interactive/sql-update.html
When a FROM clause is present, what essentially happens is that the target table is joined to the tables mentioned in the from_list, and each output row of the join represents an update operation for the target table. When using FROM you should ensure that the join produces at most one output row for each row to be modified. In other words, a target row shouldn't join to more than one row from the other table(s). If it does, then only one of the join rows will be used to update the target row, but which one will be used is not readily predictable.
Because of this indeterminacy, referencing other tables only within sub-selects is safer, though often harder to read and slower than using a join.
The UPDATE
query you display is exactly the same as:
UPDATE fromemailaddress f
SET call = true
FROM (
SELECT fromemailaddress
FROM email
WHERE subject ILIKE '%tester%'
) e
WHERE e.fromemailaddress = f.fromemailaddress;
subject ILIKE '%tester%'
is a faster equivalent for subject ~ 'tester'
. Details for LIKE
, ILIKE
and regular expression matching (~
) in the manual or in this related answer on dba.SE:
And effectively the same as:
UPDATE fromemailaddress f
SET call = true
WHERE EXISTS (
SELECT FROM email e
WHERE e.fromemailaddress = f.fromemailaddress
AND e.subject ILIKE '%tester%'
);
Use this instead.
If there should be multiple rows in table email
with the same fromemailaddress
matching a row in fromemailaddress
, then this form only executes one update per row, unlike your unfortunate original.
Don't be confused by the fact that fromemailaddress
is used as column and as table name here.
Read the manual carefully here and here. In particular this bit:
from_list
A list of table expressions, allowing columns from other tables to appear in the
WHERE
condition and the update expressions. This is similar to the list of tables that can be specified in theFROM
Clause of aSELECT
statement. Note that the target table must not appear in the from_list, unless you intend a self-join (in which case it must appear with an alias in the from_list).