Search code examples
sqlpostgresqlnaming-conventionsfunction-parameter

PostgreSQL EXISTS in user-defined function always returning true


I wrote a simple user-defined function to check for the existence of rows matching some conditions:

CREATE OR REPLACE FUNCTION is_instructor_specialized_in(eid INT, course_area VARCHAR(50))
RETURNS BOOLEAN AS $$
  SELECT EXISTS(SELECT 1 FROM Specializes s WHERE s.eid = eid AND s.name = course_area);
$$ LANGUAGE sql;

I tested it with the following query:

SELECT is_instructor_specialized_in(2, 'Artificial Intelligence') as function_output, 
    EXISTS(SELECT 1 FROM Specializes s WHERE s.eid = 2 AND s.name = 'Artificial Intelligence') as ground_truth;

and the function gave a wrong value of true when it is supposed to evaluate to false (there is no such row in the Specializes table): image

In fact, it always gives the value of true. I'm super confused. Is there any reason why this is happening?

Version: PostgreSQL 13.2 on x86_64-apple-darwin19.6.0, compiled by Apple clang version 11.0.3 (clang-1103.0.32.62), 64-bit


Solution

  • Like @wildplasser hinted, your function parameter eidhas the same name as a table column, which is never a good idea. In this case it silently broke your function.

    The unqualified eid in WHERE s.eid = eid resolves to the table column, not to the function parameter, like you seem to expect. So this predicate evaluates to true for any notnull input. Sneaky error.

    The manual:

    If the argument name is the same as any column name in the current SQL command within the function, the column name will take precedence. To override this, qualify the argument name with the name of the function itself, that is function_name.argument_name. (If this would conflict with a qualified column name, again the column name wins. You can avoid the ambiguity by choosing a different alias for the table within the SQL command.)

    Bold emphasis mine.

    Qualifying with the function name is an awkward measure of last resort. Avoid the problem to begin with unambiguous parameter names. One convention is to prefix parameters with underscore (_) - and never do the same for table columns:

    CREATE OR REPLACE FUNCTION func_proper(_eid int, _course_area text)
      RETURNS boolean
      LANGUAGE sql STABLE PARALLEL SAFE AS
    $func$
    SELECT EXISTS(SELECT FROM specializes s WHERE s.eid = _eid AND s.name = _course_area);
    $func$;
    

    Or use positional $n parameter references for a simple case like yours. You can still have parameter names for documentation and for named function calls:

    CREATE OR REPLACE FUNCTION func_proper(_eid int, _course_area text)
      RETURNS boolean
      LANGUAGE sql STABLE PARALLEL SAFE AS
    $func$
    SELECT EXISTS(SELECT FROM specializes s WHERE s.eid = $1 AND s.name = $2);
    $func$;
    

    db<>fiddle here

    The default behavior for the same naming conflict in a PL/pgSQL function is to raise an exception, btw. See: