Search code examples
.netpostgresqlnpgsql

How to check existence of multiple values at once


I have a table in PostgreSQL with a unique index on a bytea column.

Currently, I am running this query to determine whether or not a value already exists on that column in the table.

-- mycolumn definition: mycolumn bytea unique not null
SELECT 1 FROM mytable WHERE mycolumn = @mycolumn;

When I get the result to a DataReader, I just check to see if it has rows and return true if so.

However, I need to check potentially hundreds of values in a batch, and this feels a bit slow to run per value due to the round trip overhead.

I've thought about just sending multiple statements:

SELECT 1 FROM mytable WHERE mycolumn = @mycolumn1;
SELECT 1 FROM mytable WHERE mycolumn = @mycolumn2;
SELECT 1 FROM mytable WHERE mycolumn = @mycolumn3;
-- etc

Here, I would see if reader has rows like before, and then call NextResult() to go to the next query result. Of course, I'll have to generate this query and add the parameters. Is this the fastest way to check existence of multiple values at a time in pgsql?

I also considered a scheme using EXISTS and UNIONing the queries together, but the documentation says that UNION does not preserve ordering. So then I would have to also provide an order number as another parameter in the query so I could reorder the results on the server. This seems more wasteful than above... if it works.

Update

I just found that according to the docs, returning multiple result sets is not supported in PG. So above won't work. So what is the best way?

The command string can include multiple SQL commands (separated by semicolons). [...] Note however that the returned PGresult structure describes only the result of the last command executed from the string.

Apparently the above only refers to functions.


Solution

  • You can use your original idea with Npgsql. It will do everything in one roundtrip. You will be able to get the results as well.