Search code examples
postgresqlparameter-passingnaming-conventionsplpgsqlquotes

Using variables in a PL/pgSQL function


Postgres PL/pgSQL docs say:

For any SQL command that does not return rows, for example INSERT without a RETURNING clause, you can execute the command within a PL/pgSQL function just by writing the command.

Any PL/pgSQL variable name appearing in the command text is treated as a parameter, and then the current value of the variable is provided as the parameter value at run time.

But when I use variable names in my queries I get an error:

ERROR:  syntax error at or near "email"
LINE 16: ...d,email,password) values(identity_id,current_ts,''email'',''...

This is my function:

CREATE OR REPLACE FUNCTION app.create_identity(email varchar,passwd varchar)
RETURNS integer as $$
DECLARE
    current_ts          integer;
    new_identity_id     integer;
    int_max             integer;
    int_min             integer;
BEGIN
    SELECT extract(epoch FROM now())::integer INTO current_ts;
    int_min:=-2147483648;
    int_max:= 2147483647;
    LOOP
        BEGIN
            SELECT floor(int_min + (int_max - int_min + 1) * random()) INTO new_identity_id;
            IF new_identity_id != 0 THEN
                INSERT into app.identity(identity_id,date_inserted,email,password) values(identity_id,current_ts,''email'',''passwd'');
                RETURN new_identity_id;
            END IF;
        EXCEPTION
            WHEN unique_violation THEN
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Why when I use variables in the query, Postgres throws an error. How is this supposed to be written?


Solution

  • About quoting:

    About naming conflicts:

    Better solution

    I suggest a completely different approach to begin with:

    CREATE OR REPLACE FUNCTION app.create_identity(_email text, _passwd text
                                                 , OUT new_identity_id int)
      LANGUAGE plpgsql AS
    $func$
    /* + Generate completely random int4 numbers +
    -- integer (= int4) in Postgres is a signed integer occupying 4 bytes
    -- int4 ranges from -2147483648 to +2147483647, i.e. -2^31 to 2^31 - 1
    -- Multiply bigint 4294967296 (= 2^32) with random() (0.0 <= x < 1.0)
    --   trunc() the resulting (positive!) float8 - cheaper than floor()
    --   add result to -2147483648 and cast the next result back to int4
    -- The result fits the int4 range *exactly*
    */
    DECLARE
       _current_ts int := extract(epoch FROM now());
    BEGIN
       LOOP
          INSERT INTO app.identity
                (identity_id, date_inserted,  email ,  password)
          SELECT _random_int, _current_ts  , _email , _passwd
          FROM  (SELECT (bigint '-2147483648'       -- could be int, but sum is bigint anyway
                       + bigint '4294967296' * random())::int) AS t(_random_int)  -- random int
          WHERE  _random_int <> 0                   -- exclude 0 (no insert)
          ON     CONFLICT (identity_id) DO NOTHING  -- no exception raised!
          RETURNING identity_id                     -- return *actually* inserted identity_id
          INTO   new_identity_id;                   -- OUT parameter, returned at end
    
          EXIT WHEN FOUND;                          -- exit after success
          -- maybe add counter and raise warning/exception when exceeding 5/10 (?) iterations
       END LOOP;
    END
    $func$;
    

    Major points

    • Your random integer calculation would result in an integer out of range error, because the intermediate term int_max - int_min + 1 operates with integer, but the result won't fit. The above is cheaper and correct.

    • Entering a block with exception clause is considerably more expensive than without. Luckily, you do not actually need to raise an exception to begin with. Use an UPSERT (INSERT ... ON CONFLICT ... DO NOTHING), to solve this cheaply and elegantly (Postgres 9.5+).
      The manual:

      Tip: A block containing an EXCEPTION clause is significantly more expensive to enter and exit than a block without one. Therefore, don't use EXCEPTION without need.

    • You don't need an extra IF either. Use SELECT with WHERE.

    • Make new_identity_id an OUT parameter to simplify.

    • Use the RETURNING clause and insert the resulting identity_id into the OUT parameter directly. Simpler and faster. And there is an additional, subtle benefit: you get the value that was actually inserted. If there are triggers or rules on the table, this might differ from the input.

    • Assignments are relatively expensive in PL/pgSQL. Reduce those to a minimum for efficient code.
      You might remove the last remaining variable _current_ts as well, and do the calculation in the subquery, then you don't need a DECLARE at all. I left that one, since it might make sense to calculate it once, should the function loop multiple times ...

    • All that remains is one SQL command, wrapped into a LOOP to retry until success.

    • If there is a chance that your table might overflow (using most int4 numbers) - and strictly speaking, there is always the chance - I would add a counter and raise an exception after, say, 10 iterations to avoid endless loops. And a warning already after half the attempts.