Search code examples
sqlcsvsubqueryamazon-redshiftsql-like

Matching values in a comma seprated field in Redshift


I have 2 tables in Redshift as follows:

T1:

QID   EID
Q1    a.1,a.2
Q2    a.5,a.4,a.8
Q3    a.7
Q4    a.3,a.5

and

T2:

EIDs
a.1
a.2
a.4
a.7

I would like to select only the records from T1 that contain any value available in T2.EIDs. So if any of the values present in T2.EIDs is available in the comma seprated field T1.EID, then I want to select that record from T1. For the given example, here's the resultset I want:

QID   EID
Q1    a.1,a.2
Q2    a.5,a.4,a.8
Q3    a.7

I know the LIKE and SIMILAR TO clauses would be at play here, but I can't seem to come up with something that would make it work.

The final query could look something like:

SELECT T1.*
FROM T1 JOIN T2 ON <cannot think of a proper join condition>

Any help would be much appreciated.


Solution

  • That's a poor data model. Each and every value in the CSV list should be stored in a different row rather than concatenated in a string. Then query would be easy to write and efficient.

    For your current design, you can use string functions like this:

    select t1.*
    fom t1
    where exists (select 1 from t2 where ',' || t1.eid || ',' like '%,' || t2.eid || ',%')