Search code examples
sqlvertica

Why does SQL complain that a reference to a join key is ambiguous?


Why in the following code does Vertica return an error that zip3 ambiguous? Since it is used as the key for the left join, it must be equal for both tables.

SELECT zip, zip3, city, zip3_name
FROM zip_codes
LEFT JOIN zip3_codes
USING (zip3);

This generates the following error:

Query 1 ERROR: ERROR:  Column reference "zip3" is ambiguous
DETAIL:  Reference "zip3" could refer to either "public.zip_codes.zip3" or "public.zip3_codes.zip3"

I can resolve this by selecting either zip.zip3 or zip3.zip3 or COALESCE(zip.zip3, zip3.zip3) but I'm curious why this would be necessary.


Solution

  • While at first sight it may seem that the columns must always be the same, it's not really the case. The JOIN is actually a LEFT JOIN, which returns all the rows in the first table and all possible matches from the second, giving NULLs when there is no match.

    Unmatched records from zip_codes will result in different values in zip.zip3 and zip3.zip3, exactly the first giving a value and the second being NULL, so it's not really equivalent to exchange those in this particular case, you'll need to determine which columns from which table you need and disambiguate with its table name.

    Besides, even if that's weren't the case, the error would still popup. If we change your LEFT JOIN to an INNER JOIN, your assumption would be true and the columns will always have the same value, yet you would still get the very same error.

    The reason for this is that the SQL analyzer is not clever enough to realize the values are always the same. By the standard, if your query references columns with the same name from different tables you must specify from which table it is. No further attempt to resolve the ambiguity is done.