Search code examples
arrayspostgresqlplpgsql

Unnest array by one level


I want to take an array of n dimensions and return set containing rows of arrays of n-1 dimensions. For example, take the array ARRAY[[1,2,3], [4,5,6], [7,8,9]] and return a set {1,2,3}, {4,5,6}, {7,8,9}. Using unnest returns the set 1,2,3,4,5,6,7,8,9.

I tried grabbing the unnest function from PostgreSQL 8.4, which seems like it would do what I'm looking for:

CREATE OR REPLACE FUNCTION tstng.unnest2(anyarray)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql IMMUTABLE AS
$$
BEGIN
    RETURN QUERY
    SELECT $1[i]
    FROM generate_series(array_lower($1,1), array_upper($1,1)) i;
END;
$$;

However, SELECT tstng.unnest2(ARRAY[[1,2,3], [4,5,6], [7,8,9]]); returns the set , , (i.e.: 3 null rows).

I've also found that SELECT (ARRAY[[1,2,3], [4,5,6], [7,8,9]])[0]; returns null, which I believe to be the root of my problem.


Solution

  • Function

    To break out 1-dimensional arrays from n-dimensional arrays - representing leaves of the nested dimensions.

    PL/pgSQL

    With a FOR loop looping through the array.
    Works for any array dimensions and any element type:

    CREATE OR REPLACE FUNCTION unnest_nd_1d(a anyarray, OUT a_1d anyarray)
      RETURNS SETOF anyarray
      LANGUAGE plpgsql IMMUTABLE PARALLEL SAFE STRICT AS
    $func$
    BEGIN                -- null is covered by STRICT
       IF a = '{}' THEN  -- empty
          a_1d = '{}';
          RETURN NEXT;
       ELSE              --  all other cases
          FOREACH a_1d SLICE 1 IN ARRAY a LOOP
             RETURN NEXT;
          END LOOP;
       END IF;
    END
    $func$;
    

    fiddle

    SLICE 1 instructs to take 1-dimensonal arrays. (SLICE 2 would take 2-dimensional arrays.)

    This returns null on null input (because of the STRICT modifier) and an empty array on empty input (caught by IF). We need to special-case the empty array '{}' because, quoting the manual:

    The SLICE value must be an integer constant not larger than the number of dimensions of the array.

    PARALLEL SAFE only for Postgres 9.6 or later.

    Later tests revealed this PL/pgSQL function to be fastest.
    Related:

    Pure SQL

    Only works for 2D arrays (of any element type):

    CREATE OR REPLACE FUNCTION unnest_2d_1d(anyarray)
      RETURNS SETOF anyarray
      LANGUAGE sql IMMUTABLE PARALLEL SAFE STRICT AS
    $func$
    SELECT array_agg($1[d1][d2])
    FROM   generate_subscripts($1,1) d1
        ,  generate_subscripts($1,2) d2
    GROUP  BY d1
    ORDER  BY d1
    $func$;
    

    This is an improved and simplified version of the function Lukas posted.

    fiddle
    Old sqlfiddle

    Explanation

    SELECT (ARRAY[[1,2,3], [4,5,6], [7,8,9]])[0]
    

    returns the same as:

    SELECT (ARRAY[[1,2,3], [4,5,6], [7,8,9]])[17]
    

    ... which is NULL. The manual:

    By default, the lower bound index value of an array's dimensions is set to one.

    0 has no special meaning as array subscript. There's just nothing there for Postgres arrays with default array indexes.
    Also, with two-dimensional arrays, you need two indexes to get a base element. Like:

    SELECT (ARRAY[[1,2,3], [4,5,6], [7,8,9]])[1][2]
    

    Result:

    2
    

    The first part of your message is a bit unclear.

    SELECT array_dims(ARRAY[[1,2,3], [4,5,6], [7,8,9]]);
    

    Result:

    [1:3][1:3]
    

    That's two dimensions with 3 elements (1 to 3) each (9 base elements).
    If you want n-1 dimensions then this is a correct result:

    SELECT ARRAY (SELECT unnest('{{1,2,3}, {4,5,6}, {7,8,9}}'::int[]))
    

    Result:

    {1,2,3,4,5,6,7,8,9}
    

    That's one dimension. unnest() produces one base element per row (regardless of array dimensions). Your example is just another 2-dimensional array with a missing set of curly brackets ... ?

    {1,2,3}, {4,5,6}, {7,8,9}
    

    If you want a slice of the array:

    SELECT (ARRAY[[1,2,3], [4,5,6], [7,8,9]])[1:2]
    

    Result:

    {{1,2,3},{4,5,6}}
    

    Or:

    SELECT (ARRAY[[1,2,3], [4,5,6], [7,8,9]])[2:2][1:2]
    

    Result:

    {{4,5}}
    

    To flatten the result (get a 1D array):

    Read the manual here.

    Postgres 8.4-

    In ancient versions of Postgres, array_agg() is not installed, yet. Create it first:

    CREATE AGGREGATE array_agg(anyelement) (
     SFUNC = array_append,
     STYPE = anyarray,
     INITCOND = '{}'
    );
    

    Also, generate_subscripts() is not born, yet. Use instead:

    ...
    FROM   generate_series(array_lower($1,1), array_upper($1,1)) d1
        ,  generate_series(array_lower($1,2), array_upper($1,2)) d2
    ...
    

    Call:

    SELECT unnest_2d_1d(ARRAY[[1,2], [3,4], [5,6]]);
    

    Result

    {1,2}
    {3,4}
    {5,6}