Search code examples
javascripttypescriptpromisecancellation

How can I prevent the losing arm of a Promise.race from activating?


I'm sending the same query to a large (well, okay, n=5) number of identical endpoints (a few kubernetes clusters) and collating the results together. I want the requests to go out in parallel and time out after a while; I want failures to be reported to the user without impeding further progress.

As an example of the desired output:

$ node find_broken_pods.js
tomato: error: timed out
Cluster  Pod         Reason
potato   bar-foos    Invalid PVC
potato   foo-eggnog  Invalid image
yoghurt  spam-baz    Invalid name

$ node find_broken_pods.js
Cluster  Pod         Reason
potato   bar-foos    Invalid PVC
potato   foo-eggnog  Invalid image
yoghurt  spam-baz    Invalid name
tomato   what-bat    Insufficient jQuery

The first time around, we failed to get an answer from the tomato cluster for whatever reason, but we were able to enumerate the resources from the other clusters. The second time around, there was no timeout and we were able to list everything.

I previously had come up with this:

export async function queryAll(): Promise<{cluster: string; deployments: V1Pod[]}[]> {
  const out: {cluster: string; result: V1Pod[]}[] = [];

  const promises: Promise<number>[] = [];
  for (const cluster of Object.values(CLUSTERS)) {
    promises.push(
      Promise.race([
        new Promise<number>((_, reject) => setTimeout(() => reject(new Error(`${cluster}: timed out`)), 5000)),
        new Promise<number>((resolve, _) =>
          getAllPods(cluster)
            .then(pods => resolve(out.push({cluster: cluster, result: pods})))
        ),
      ])
    );
  }

  await Promise.all(promises);

  return out;
}

This does run everything in parallel, but a single failure results in the entire function crashing and burning. I figured I could change it to this:

export async function queryAll(): Promise<{cluster: string; deployments?: V1Deployment[]; error?: string}[]> {
  const out: {cluster: string; result?: V1Pod[]; error?: string}[] = [];

  const promises: Promise<number>[] = [];
  for (const cluster of Object.values(CLUSTERS)) {
    promises.push(
      Promise.race([
        new Promise<number>((resolve, _) =>
          setTimeout(() => {
            resolve(out.push({cluster: cluster, error: 'timed out'}));
          }, 5000)
        ),
        new Promise<number>((resolve, _) =>
          getAllPods(cluster)
            .then(pods => resolve(out.push({cluster: cluster, result: pods})))
        ),
      ])
    );
  }

  await Promise.all(promises);

  return out;
}

However, I'm now witnessing both arms of the promise run to completion, that is the out array contains (some) clusters reporting data and either all or none of my clusters reporting a timeout:

  • if no timeout is reached, Promise.all won't wait for the setTimeouts, but the node process will (i.e. if I set the timeout to 60 seconds then the node process will only exit after 60 seconds)
  • if any timeout is reached, Promise.all will wait for the timeout to happen... meaning all setTimeouts end up triggering.

I instead expected that the losing arm of the Promise.race would be killed somehow or prevented from running.

Something tells me my approach is fundamentally broken... how can I improve my fault tolerance?


Solution

  • Yes, you've got both tasks running and pushing objects to the out array. Nothing is stopping them - Promise.race does not magically undo the steps that were taken to create the promises passed to it.

    You can solve this with a flag for each cluster whether the timeout or result has already been reported so that the loser of the race won't report their status.

    However, it's much simpler to just use the result value that the promises fulfill with, which Promise.race already does forward:

    interface QueryResult {cluster: string; deployments?: V1Deployment[]; error?: string}
    export async function queryAll(): Promise<QueryResult[]> {
      const promises: Promise<QueryResult>[] = Object.values(CLUSTERS).map(cluster =>
        Promise.race([
          new Promise(resolve => {
            setTimeout(() => {
              resolve({cluster: cluster, error: 'timed out'});
            }, 5000)
          }),
          getAllPods(cluster).then(pods => ({cluster: cluster, result: pods}))
        ])
      );
    
      const out = await Promise.all(promises);
    
      return out;
    }
    

    With this approach, you even get the results consistently in the same order as the CLUSTERS, not in the order they came back. If you don't want that, you can still take your original approach with push - just do it with the result of each race!

    export async function queryAll(): Promise<QueryResult[]> {
      const out: QueryResult[] = [];
      await Promise.all(Object.values(CLUSTERS).map(cluster =>
        Promise.race([
          new Promise(resolve => {
            setTimeout(() => {
              resolve({cluster: cluster, error: 'timed out'});
            }, 5000)
          }),
          getAllPods(cluster).then(pods => ({cluster: cluster, result: pods}))
        ]).then(result => {
          out.push(result);
        })
      ));
      return out;
    }
    

    Btw, also consider Promise.allSettled instead of Promise.all - you might as well reject from your timeout promise with that.