Search code examples
postgresql

Why does an operation on real and a numeric inputs result in a double precision?


Performing a mathematical operation on a real operand and a numeric operand in PostgreSQL results in a double precision value, but I find this surprising. Why wouldn't the result be a real, instead?

I confirmed this behavior with PostgreSQL 13 and 16.

SELECT 
   pg_typeof(0::numeric + 0::numeric), -- numeric
   pg_typeof(0::real    + 0::real),    -- real
   pg_typeof(0::numeric + 0::real),    -- double precision??
   pg_typeof(0::real    + 0::numeric); -- double precision??

I have read substantial parts of the PostgreSQL documentation, most relevantly: 10.2. Type Conversion: Operators

And I have queried pg_catalog.pg_cast (or psql's \dC) to understand what the relevant implicit casts are:

> SET search_path = 'pg_catalog';
> SELECT format_type(castsource, NULL) AS source,
         format_type(casttarget, NULL) AS target
  FROM pg_cast
  WHERE castcontext = 'i' -- i.e. casts allowed implicitly
    AND ARRAY[format_type(castsource, NULL), format_type(casttarget, NULL)] 
        <@ '{numeric,real,double precision}'::text[];

The result is:

 source  |      target
---------+------------------
 real    | double precision
 numeric | real
 numeric | double precision
 numeric | numeric
(4 rows)

So as I understand it, for 0::numeric + 0::real, PostgreSQL should (excerpts are all from 10.2. Type Conversion: Operators):

  1. Select the operators to be considered from the pg_operator system catalog. If a non-schema-qualified operator name was used (the usual case), the operators considered are those with the matching name and argument count that are visible in the current search path.

I expect it to start with all + operators - each one accepts a pair of the same numeric types (real + real, integer + integer, etc.)

  1. Check for an operator accepting exactly the input argument types.

In step 2, no operators are an exact match for real and numeric, so I expect that nothing happens here.

3a. Discard candidate operators for which the input types do not match and cannot be converted (using an implicit conversion) to match.

I expect that this should discard most of the candidates. It should keep exactly two: the one for real, and the one for double precision that it apparently ends up using. (Note that real cannot be implicitly cast to numeric, so it should not keep the one for numeric.)

3b. If any input argument is of a domain type, treat it as being of the domain's base type for all subsequent steps.

I expect this does nothing, as there are no domain types here.

3c. Run through all candidates and keep those with the most exact matches on input types. Keep all candidates if none have exact matches. If only one candidate remains, use it; else continue to the next step.

Here, I expect it should discard the operator for double precision, because the operator for real has one exact match, while the one for double precision has zero exact matches. And then since there is only one candidate left here, it should use the operator for real, and I expect the result to be a real. But instead, the result is a double precision value.

So, my question is why is PostgreSQL choosing the double precision operator in this case, where it needs to implicitly convert both operands instead of just one of them? Is the documentation wrong, or my understanding of it wrong?


Solution

  • The two missing pieces:

    1. The doc on result types in mixed-type mathematical operations:

      Calls involving multiple argument data types, such as integer + numeric, are resolved by using the type appearing later in these lists.

      The list in question being "smallint, integer, bigint, numeric, real, and double precision". That's the general rule, but it's a consequence of what operators are built-in, discussed below.

    2. Type preference. You can check it in pg_type.ispreferred. There, double precision shows up as the preferred type over both real as well as over numeric, in the Numeric types group:

      select typname
           , typispreferred
      from pg_type 
      where typcategory='N'--N for Numeric types
      order by typispreferred desc
              ,typname
      
      typname typispreferred
      float8
      (double precision)
      True
      oid True
      float4
      (real)
      false
      int2
      (smallint)
      false
      int4
      (int)
      false
      int8
      (bigint)
      false
      money false
      numeric false

      You stopped in 10.2. Type Conversion: Operators right before reaching the step it uses that:

      3d. Run through all candidates and keep those that accept preferred types (of the input data type's type category)

      It does go to 3d, because in 3c it finds there is no real+numeric operator (no exact matches, three half-matches):

      select oprleft::regtype
           , oprright::regtype
           , oprresult::regtype
      from pg_operator
      where '{real, double precision,numeric}' 
          && array[oprleft,oprright]::regtype[]
        and oprname='+';
      
      oprleft oprright oprresult
      real real real
      double precision double precision double precision
      real double precision double precision
      double precision real double precision
      numeric numeric numeric
      - real real
      - double precision double precision
      - numeric numeric
      pg_lsn numeric pg_lsn
      numeric pg_lsn pg_lsn

      And out of the three closest candidates, real+real (cast right) real+double precision (cast right) and numeric+numeric (cast left), the middle one wins, thanks to its destination type being the preferred one.

    The first one is just to make this answer more generic and apply to other combinations. The second one is precisely why you see real+numeric result in real+numeric::double precision that yields a double precision, rather than a real+numeric::real that would get you a real.

    why is PostgreSQL choosing the double precision operator in this case, where it needs to implicitly convert both

    It only converts one, the numeric. Your test used constants that get constant-folded leaving just the pre-evaluated result and hiding the underlying operations. You can see that more clearly if you use explain verbose and a table with different values:
    demo at db<>fiddle

    EXPLAIN VERBOSE
    SELECT pg_typeof(n+n) nn
         , pg_typeof(r+r) rr
         , pg_typeof(n+r) nr
         , pg_typeof(r+n) rn
    FROM test;
    
    Seq Scan on public.test (cost=0.00..51.23 rows=970 widt=20)
    Output: pg_typeof((n + n))
                 , pg_typeof((r + r))
                 , pg_typeof(((n)::double precision + r))
                 , pg_typeof((r + (n)::double precision))