Search code examples
sqlpostgresql

SQL query for returning rows with only certain values in list but not others?


I have a table that looks something like this:

id colors other column other column
1 red, green, blue ... ...
2 red, blue ... ...
3 red ... ...
4 white, blue, purple ... ...
5 green, white, purple ... ...

I need a SQL query that would return the rows where the values match only those in my filter list, without returning rows which have any other values in the colors column.

The query I'm working from is this:

select * from table where colors in ('red', 'blue');

The problem with this one is that it will return rows 1-4 in the result, since each row has either red or blue in its colors. Instead, I want to return only rows 2 and 3, which are the only rows which don't have any extra colors outside of my filter list.

There could be any number of colors in the column, so I don't have a good list to use as a NOT filter unfortunately. I need to be able to return the entire row from the DB, including all available columns.

Appreciate any help or suggestions.

TIA!


Solution

  • As other pointed out in the comment section this is really a bad design and will cause more pain in the future, fix it if possible.

    See Is storing a delimited list in a database column really that bad?

    Regarding the question,

    First split the comma separated string

    SELECT 
      id,
      TRIM(UNNEST(STRING_TO_ARRAY(colors, ','))) AS colors
    FROM my_table
    

    Result

    id  colors
    1   red
    1   green
    1   blue
    2   red
    2   blue
    3   red
    ......
    

    Then filter the results

    with cte as  (
            SELECT id,
                   TRIM(UNNEST(STRING_TO_ARRAY(colors, ','))) AS colors
            FROM my_table
    ) select id
      from cte 
    group by id
    having count (distinct colors) = 2 
      and min(colors) = 'blue'
      and max(colors) = 'red';
    

    Result

    id
    2
    

    If you need all the columns use,

    with split_string as  (
            SELECT id,
                   TRIM(UNNEST(STRING_TO_ARRAY(colors, ','))) AS colors
            FROM my_table
    ), ids as (
                 select id
                 from split_string 
                 group by id
                 having count (distinct colors) = 2 
                 and min(colors) = 'blue'
                 and max(colors) = 'red'
      ) select m.*
        from my_table m 
        inner join ids i on i.id=m.id;
    

    Result

    id  colors
    2   red, blue
    

    See example here


    Edit

    I want to return only rows 2 and 3, which are the only rows which don't have any extra colors outside of my filter list

    WITH cte AS  (
            SELECT id,
                   TRIM(UNNEST(STRING_TO_ARRAY(colors, ','))) AS colors
            FROM my_table
    ) SELECT DISTINCT id
      FROM cte 
      WHERE colors IN ('blue','red')
      AND id NOT IN (SELECT id 
                      FROM cte
                      WHERE colors NOT IN ('blue','red')
                      );
    

    See example