Search code examples
sqlnode.jspostgresqlrelational-divisionnode-postgres

Query unknown number of keywords in Postgres


I'm currently using postgres in node to query all users who have a certain tag associated with their account like so (note: I'm using node-postgres):

query = 'SELECT tags.*, pl.email FROM admin.tags tags
LEFT JOIN gameday.player_settings pl
ON tags.player_id = pl.id
WHERE tags.tag = $1'

client.query(
  query,
  [tagName],
  function(err, results) {
    ...

[tagName] is then passed into the WHERE clause.

What I'm aiming to do is instead query by an unknown number of tags and return all users who have all of those tags associated with their account. So instead of [tagName] I'd like to pass in an array of unknown length, [tagNames], but I'm not sure how to accomplish this.


Solution

  • You need to turn the question backwards. Instead of:

    Which users have all of these tags, you need to ask which users do not have one or more of these tags absent. It's a double negation.

    You also need a way to pass the set of tags. The best way to do this, if the client language binding supports it, is as an array-valued query parameter. If the client binding doesn't support array-valued parameters you'll need dynamic SQL.

    One formulation might be (untested, since you didn't provide sample schema and data):

    SELECT pl.email
    FROM gameday.player_settings pl
    WHERE NOT EXISTS (
      SELECT 1
      FROM unnest(?) AS wanted_tags(tag)
      LEFT JOIN admin.tags tags
        ON tags.tag = wanted_tags.tag
      WHERE  tags.player_id = pl.id
        AND wanted_tags.tag IS NULL
    );
    

    Doing a left join and filtering for IS NULL is called a left anti-join. It keeps the rows where the left-join condition does not match. So in this case, we retain a tag from our wanted_tags array only if there is no matching tag associated with this player. If any tags are left, the WHERE NOT EXISTS returns false, so the player is excluded.

    Double-think, isn't it? It's easy to make mistakes with this so test.

    Here ? should be your programming language PostgreSQL database binding's query parameter placeholder. I don't know what node.js's is. This will only work if you can pass an array as a query parameter in node. If not, you'll have to use dynamic SQL to generate an ARRAY['x','y','z'] expression or a (VALUES ('x'), ('y'), ('z')) subquery.

    P.S. Please provide sample schema and data with questions when possible. http://sqlfiddle.com/ is handy.