Search code examples
pg-promise

Why should we use batch() instead of Promise.all?


From the pg-promise FAQ Why use method batch instead of promise.all?:

It is quintessential to settle all the promises-queries created within your task or transaction, before the connection is released

I don't see why this should be a problem.

For example when we have an array of queries like this:

  [
    t.any("SELECT pg_sleep(2) as a"),
    t.any('this will fail'),
    t.any("SELECT pg_sleep(3) as b")
  ]

Note: pg_sleep is only used for testing.
In production this would be Insert/Update/Delete statements. And we only want to commit the transaction when all have been successful: i.e. return an error when any of them fails.

When we use batch():

  • the first promise will resolve after 2 seconds
  • the 2nd promise will reject
  • the 3rd query will still be sent to the databsae and returns after 3 more seconds
  • finally (after a total of 5 seconds), batch is done and we can return an error to the caller.

When we use Promise.all():

  • the first promise will resolve after 2 seconds
  • the 2nd promise will reject - and this will rollback the transaciton and release the database connection
  • now we can already return an error to the caller
  • the 3rd request would fail immediately with Querying against a released or lost connection.. This is anyway expected, so we can igonre it.

So I'd say that Promise.all is better, because:

  • it returns immediately after the first error
  • will not even send the 3rd useless query to the datababase

What am I missing?
Does this maybe this causes other issues: e.g. that a broken connection is returned to the pool, etc.


Solution

  • Method batch caters for scenario where there may be dynamic number of queries created.

    It makes sure that all queries are settled (resolved or rejected), so you do not end up with queries being executed against a closed connection, and getting that Querying against a released or lost connection error. It can be bad/confusing, to start getting those errors occur outside of the context, and you can't diagnose what's going on.

    Method Promise.all does not settle promises, it stops processing and rejects when the first promise in the array rejects.

    And while method batch is still quite useful, as it is more flexible in how it can handle the values, and gives better result/error details than Promise.all, its use today is no longer necessary. It was developed during the ES5 era, when async/await did not exist. But today you can easily replace it with async/await:

    Old style:

    db.task('get-all-records', t => {
        return t.batch([
            t.any('SELECT * FROM apples'),
            t.any('SELECT * FROM oranges')
        ]);
    })
        .then([apples, oranges] => {
            // process data here
        })
        .catch(error => {});
    

    New style:

    const {apples, oranges} = await db.task('get-all-records', async t => {
        const apples = await t.any('SELECT * FROM apples');
        const oranges = await t.any('SELECT * FROM oranges');
        return {apples, oranges};
    });
    

    The result from the two examples above will be identical, though they are not the same in terms of the execution logic, as the first one is fully asynchronous, while the latter uses async/await, which are blocking operations, they prevent you from even creating the next query, if one before fails.

    Extras

    The best-performing approach when it comes to executing multiple independent queries (that do not depend on each other), is by concatenating all queries, and executing them all as one query.

    For that there is method helpers.concat, plus database method multi, to handle multiple results:

    const queries = [
        {query: 'SELECT * FROM apples WHERE color = $1', values: ['green']},
        'SELECT * FROM oranges'
    ];
    const sql = pgp.helpers.concat(queries);
    
    const [apples, oranges] = await db.multi(sql);
    

    You won't even need a transaction for it, unless some of your independent queries change data.