Search code examples
postgresqlforeign-data-wrappermulticorn

FDW in Postgres: Batching ids for an external request?


I'm developing a collection of foreign data wrappers using multicorn and I've run into an issue with batching data.

So, I have two foreign tables, search and data, that are each backed by a foreign data wrapper that I'm writing.

I need to do a basic join on these tables:

SELECT data.*
FROM search, data
WHERE search.data_id = data.id
AND search.term = 'search for this pls'

This works, but there's a hitch in the data fdw being able to batch queries to the server. If the search table returns 5 ids for a given search, then the data fdw is executed once for each of those ids. The API backing the data fdw is capable of processing many ids in one request.

The following works:

SELECT data.*
FROM data
WHERE id in ('2244', '31895')

In this case the data fdw receives an array of both ids and is able to perform one request.

Is there any way to make the join work where the data fdw has the opportunity to batch ids for a request?

Thanks!


Solution

  • You should look at the EXPLAIN output for your query, and then you'll probably see that PostgreSQL is performing a nested loop join, i.e. it scans search for the matching rows, and for each result row scans data for matching rows.

    PostgreSQL has other join strategies like hash joins, but for that it would have to read the whole data table, which is probably not a win. You might want to try it by setting enable_nestloopto off and testing query performance. If that is an improvement, you might want to adjust the cost values for the foreign table scan on data to reflect the high “startup costs” so that the planner becomes more reluctant to choose a nested loop join.

    There is no such join strategy as you propose – while it may well be a win for FDW joins, it does not offer advantages in regular joins. So if the join strategy you envision is really the optimal one, you'd have to first fetch the data_ids from search, construct a query for data and implement the join in the application.