Search code examples
sqlpostgresqlselectconcatenationdynamic-sql

Select concatenated columns based on criteria list in other table


I have a table1

line a b c d e f g h
1 18 2 2 22 0 2 1 2
2 20 2 2 2 0 0 0 2
3 10 2 2 222 0 2 1 2
4 12 2 2 3 0 0 0 0
5 15 2 2 3 0 0 0 0

And a table2

 line criteria
1  a,b
2  b,c,f,h
3  a,b,e,g,h
4  c,e

I am using this code to see/select the unique results of concated/joined columns, like concat(c,',',d), concat(b,',',d,',',g) and so on from table1 and is working perfectly:

SELECT DISTINCT(CONCAT(c,',',d)) 
FROM table1

But, instead of writing manually like concat(c,',',d), I want to refer to table2.criteria to get columns references to be concated/joined from table1 so that i can see the entire unique results against each concated criteria

Tried this, but getting an error:

SELECT DISTINCT(SELECT criteria FROM table2) 
FROM table1

ERROR: more than one row returned by a subquery used as an expression
SQL state: 21000

The expected unique result is something like this;

| criteria     | result     |
| ------------ | ---------- |
| a,b          | 15,2       |
| a,b          | 10,2       |
| a,b          | 20,2       |
| a,b          | 12,2       |
| a,b          | 18,2       |
| b,c,f,h      | 2,2,2,2    |
| b,c,f,h      | 2,2,0,2    |
| b,c,f,h      | 2,2,0,0    |
| a,b,e,g,h    | 20,2,0,0,2 |
| a,b,e,g,h    | 12,2,0,0,0 |
| a,b,e,g,h    | 15,2,0,0,0 |
| a,b,e,g,h    | 10,2,0,1,2 |
| a,b,e,g,h    | 18,2,0,1,2 |
| c,e          | 2,0        |

Solution

  • SQL does not allow to parameterize identifiers. There are various ways to work around this restriction.

    It's unclear from the question, but according to comments you want to concatenate the given pattern for every row in table1.

    1. Dynamic SQL

    Create a helper function (once!) that concatenates and executes statements dynamically.
    Basics:

    CREATE OR REPLACE FUNCTION f_concat_cols(_cols text)
      RETURNS TABLE (result text)
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       RETURN QUERY EXECUTE format(
         $q$SELECT concat_ws(',', %s) FROM table1 ORDER BY line$q$, _cols);
    END
    $func$;
    

    It's a set-returning function (a.k.a. "table function"), to return one result row for every row in table1 for each given pattern.

    Warning: Converting user input to code like this is a prime opportunity for SQL injection. You must make sure that table1.criteria can only hold valid strings!

    To get the full result matrix (with distinct results per row in table2), the query is simple now:

    SELECT DISTINCT line AS t2_line, criteria, t1.*
    FROM   table2, f_concat_cols(criteria) t1
    ORDER  BY t2_line;
    

    2. Workaround with conversion to JSON

    SELECT DISTINCT t2.line AS t2_line, t2.criteria, c.*
    FROM   table2 t2
    CROSS  JOIN (SELECT line, to_json(t) AS js FROM table1 t) t1
    CROSS  JOIN LATERAL (
       SELECT string_agg(t1.js->>sub, ',') AS result
       FROM   string_to_table(t2.criteria, ',') sub
       ) c
    ORDER  BY t2_line;
    

    After converting rows from t1 to a JSON record, we can access keys (converted from column names) directly.
    I unnest the pattern with string_to_table() (Postgres 14+), access each single key, and aggregate the result in a LATERAL subquery. See:

    You could encapsulate the logic in a function like in 1., but that's optional in this case.

    3. Workaround with conversion to Postgres arrays

    SELECT DISTINCT t2.line AS t2_line, t2.criteria, c.*
    FROM   table2 t2
    CROSS  JOIN (SELECT line, ARRAY [a,b,c,d,e,f,g,h] AS arr FROM table1 t) t1
    CROSS  JOIN LATERAL (
       SELECT string_agg(t1.arr[idx::int]::text, ',') AS result
       FROM   string_to_table(translate(t2.criteria, 'abcdefgh', '12345678'), ',') idx
       ) c
    ORDER  BY t2_line;
    

    Similar to the "trick" with JSON, we can avoid dynamic SQL by converting columns to a plain Postgres array. Then project column names to integer array indices. I use translate() for the simple case, but that only works for single letters! Use replace() or regexp_replace() or some other method for longer names.
    The rest is like the above.

    fiddle - showing all.