Why in the following code does Vertica return an error that zip3
ambiguous? Since it is used as the key for the left join, it must be equal for both tables.
SELECT zip, zip3, city, zip3_name
FROM zip_codes
LEFT JOIN zip3_codes
USING (zip3);
This generates the following error:
Query 1 ERROR: ERROR: Column reference "zip3" is ambiguous
DETAIL: Reference "zip3" could refer to either "public.zip_codes.zip3" or "public.zip3_codes.zip3"
I can resolve this by selecting either zip.zip3
or zip3.zip3
or COALESCE(zip.zip3, zip3.zip3)
but I'm curious why this would be necessary.
While at first sight it may seem that the columns must always be the same, it's not really the case. The JOIN
is actually a LEFT JOIN
, which returns all the rows in the first table and all possible matches from the second, giving NULL
s when there is no match.
Unmatched records from zip_codes
will result in different values in zip.zip3
and zip3.zip3
, exactly the first giving a value and the second being NULL
, so it's not really equivalent to exchange those in this particular case, you'll need to determine which columns from which table you need and disambiguate with its table name.
Besides, even if that's weren't the case, the error would still popup. If we change your LEFT JOIN
to an INNER JOIN
, your assumption would be true and the columns will always have the same value, yet you would still get the very same error.
The reason for this is that the SQL analyzer is not clever enough to realize the values are always the same. By the standard, if your query references columns with the same name from different tables you must specify from which table it is. No further attempt to resolve the ambiguity is done.