Search code examples
cpostgresqllibpq

How to handle a numeric data type from PostgreSQL in C?


I try to convert data queried via libpq in binary format into Arrow format in C. For that, I queried the data type Oids for the corresponding columns via PQftype() and match them with Arrow datatypes. But I am not sure how to handle numerics.

The query SELECT oid, typname FROM pg_type; returns 1700 for numeric, but how do I get precision and scale?

I am not able to find helpful information in the documentation. Am I searching at the wrong spot? https://www.postgresql.org/docs/13/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL

What can I expect to get in my PGresult from a binary numeric?

Thanks in advance :)

EDIT

So with the help of Laurenz I managed to put a function together, which converts a Numeric in binary form from libpq into a basic String representation. Of course binary results only make sense, if you do not intend to convert the Numerics to Strings, but it helps to understand the format of Numerics.

The binary form basically is a list of 2 byte integers with different meanings and the digit integers are just concatenated in their String form. Here is 49273.64 with precision 20 and scale 2 in binary and in fields:

ndigits | 00000000 | 00000011 | weight | 00000000 | 00000001 | sign | 00000000 | 00000000 | dscale | 00000000 | 00000010 | digits | 00000000 | 00000100 | 00100100 | 00111001 | 00011001 | 00000000 
ndigits: 3, weight: 1, sign: 0, dscale: 2, digits: 4 | 9273 | 6400
char *getStrFromNumeric(u_int16_t *numvar){
    u_int16_t ndigits = ntohs(numvar[0]); // how many u_int16_t at numvar[4]
    int16_t dscale = ntohs(numvar[3]); // how many char digits after decimal point
    int16_t weight = ntohs(numvar[1])+1; // weight+1 is how many u_int16_t from numvar[4] are before decimal point. here weight already gets +1 at initialization.
    char *result = (char *)malloc(sizeof(char)*(weight+dscale)+1+1+2); // +1+1 -> '\0' and '.'
    char *copyStr = (char *) malloc(sizeof (char)*5);
    int strindex = 0;
    int numvarindex = 0;
    while(weight>0){
        sprintf(copyStr, "%d", ntohs(numvar[numvarindex+4]));
        sprintf(&(result[strindex]), "%s", copyStr);
        strindex += strlen(copyStr);
        numvarindex++;
        weight--;
    }
    sprintf(&(result[strindex]), ".");
    strindex++;
    while(dscale>0){
        sprintf(copyStr, "%d", ntohs(numvar[numvarindex+4]));
        dscale -= strlen(copyStr);
        sprintf(&(result[strindex]), "%s", copyStr);
        strindex += strlen(copyStr);
        numvarindex++;
    }
    sprintf(&(result[strindex]), "\0");
    return result;
}

Solution

  • You can find the implementation details in src/backend/utils/adt/numeric.c. Scale and precision of the numeric are not stored in pg_type, because they are not part of the data type. The relevant attribute is atttypmod in pg_attribute, because scale and precision are part of the column definition.

    You can glean the way scale and precision are encoded in the type modifier for example from this function:

    Datum
    numerictypmodout(PG_FUNCTION_ARGS)
    {
        int32       typmod = PG_GETARG_INT32(0);
        char       *res = (char *) palloc(64);
    
        if (typmod >= 0)
            snprintf(res, 64, "(%d,%d)",
                     ((typmod - VARHDRSZ) >> 16) & 0xffff,
                     (typmod - VARHDRSZ) & 0xffff);
        else
            *res = '\0';
    
        PG_RETURN_CSTRING(res);
    }
    

    So to get precision and scale of column n of table t, you could run

    SELECT (atttypmod - 4) >> 16 & 65535 AS precision,
           (atttypmod - 4) & 65535 AS scale
    FROM pg_attribute
    WHERE attrelid = 't'::regclass
      AND attname = 'n';
    

    The binary format for numeric is defined in the function numeric_send:

    Datum
    numeric_send(PG_FUNCTION_ARGS)
    {
        Numeric     num = PG_GETARG_NUMERIC(0);
        NumericVar  x;
        StringInfoData buf;
        int         i;
    
        init_var_from_num(num, &x);
    
        pq_begintypsend(&buf);
    
        pq_sendint16(&buf, x.ndigits);
        pq_sendint16(&buf, x.weight);
        pq_sendint16(&buf, x.sign);
        pq_sendint16(&buf, x.dscale);
        for (i = 0; i < x.ndigits; i++)
            pq_sendint16(&buf, x.digits[i]);
    
        PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
    }
    

    The individual parts are described in the documentation for NumericVar:

    /* ----------
     * NumericVar is the format we use for arithmetic.  The digit-array part
     * is the same as the NumericData storage format, but the header is more
     * complex.
     *
     * The value represented by a NumericVar is determined by the sign, weight,
     * ndigits, and digits[] array.  If it is a "special" value (NaN or Inf)
     * then only the sign field matters; ndigits should be zero, and the weight
     * and dscale fields are ignored.
     *
     * Note: the first digit of a NumericVar's value is assumed to be multiplied
     * by NBASE ** weight.  Another way to say it is that there are weight+1
     * digits before the decimal point.  It is possible to have weight < 0.
     *
     * buf points at the physical start of the palloc'd digit buffer for the
     * NumericVar.  digits points at the first digit in actual use (the one
     * with the specified weight).  We normally leave an unused digit or two
     * (preset to zeroes) between buf and digits, so that there is room to store
     * a carry out of the top digit without reallocating space.  We just need to
     * decrement digits (and increment weight) to make room for the carry digit.
     * (There is no such extra space in a numeric value stored in the database,
     * only in a NumericVar in memory.)
     *
     * If buf is NULL then the digit buffer isn't actually palloc'd and should
     * not be freed --- see the constants below for an example.
     *
     * dscale, or display scale, is the nominal precision expressed as number
     * of digits after the decimal point (it must always be >= 0 at present).
     * dscale may be more than the number of physically stored fractional digits,
     * implying that we have suppressed storage of significant trailing zeroes.
     * It should never be less than the number of stored digits, since that would
     * imply hiding digits that are present.  NOTE that dscale is always expressed
     * in *decimal* digits, and so it may correspond to a fractional number of
     * base-NBASE digits --- divide by DEC_DIGITS to convert to NBASE digits.
     *
     * rscale, or result scale, is the target precision for a computation.
     * Like dscale it is expressed as number of *decimal* digits after the decimal
     * point, and is always >= 0 at present.
     * Note that rscale is not stored in variables --- it's figured on-the-fly
     * from the dscales of the inputs.
     *
     * While we consistently use "weight" to refer to the base-NBASE weight of
     * a numeric value, it is convenient in some scale-related calculations to
     * make use of the base-10 weight (ie, the approximate log10 of the value).
     * To avoid confusion, such a decimal-units weight is called a "dweight".
     *
     * NB: All the variable-level functions are written in a style that makes it
     * possible to give one and the same variable as argument and destination.
     * This is feasible because the digit buffer is separate from the variable.
     * ----------
     */
    

    For convenient handling of the binary representation of numeric, you should use the libpgtypes library.