Batching futures in Dart

I want to batch many futures into a single request that fires either when a maximum batch size is reached or when a maximum time has elapsed since the earliest future was received.

Motivation

In Flutter, I have many UI elements that each need to display the result of a future, which depends on the data in that UI element.

For instance, I have a widget for a place, and a sub-widget which displays how long it will take to walk to that place. To compute the walking time, I issue a request to the Google Maps API to get the travel time to the place.

It is more efficient and cost-effective to combine all these API requests into a single batch request. So if the widgets make 100 requests at essentially the same moment, the futures could be proxied through a single provider, which batches them into one request to Google and unpacks Google's response into results for the individual requests.

The provider needs to know when to stop waiting for more futures and when to actually issue the request, which should be controllable by the maximum "batch" size (i.e., # of travel time requests), or the maximum amount of time you are willing to wait for batching to take place.

The desired API would be something like:


// Client gives this to tell provider how to compute batch result.
abstract class BatchComputer<K,V> {
  Future<List<V>> compute(List<K> batchedInputs);
}

// Batching library returns an object with this interface
// so that the client can submit inputs to be completed by the batch provider.
abstract class BatchingFutureProvider<K,V> {
  Future<V> submit(K inputValue);
}

// How do you implement this in Dart?
BatchingFutureProvider<K,V> create<K,V>(
   BatchComputer<K,V> computer, 
   int maxBatchSize, 
   Duration maxWaitDuration,
);

Does Dart (or a pub package) already provide this batching functionality, and if not, how would you implement the create function above?

Solution

This sounds perfectly reasonable, but also very specialized. You need a way to represent a query, to combine these queries into a single super-query, and to split the super-result into individual results afterwards, which is what your BatchComputer does. Then you need a queue of pending queries that is flushed through that computation when certain conditions are met.
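
As a rough illustration of the combine-and-split step, here is a minimal sketch. The function fetchWalkingMinutes is a hypothetical stand-in for a single remote call (it is not a real Google Maps API), and computeBatch is an assumed name; the point is only that results[i] must answer queries[i].

```dart
import 'dart:async';

/// Hypothetical stand-in for one remote call that answers many queries.
/// It returns a map, so the reply order need not match the query order.
Future<Map<String, int>> fetchWalkingMinutes(Set<String> placeIds) async {
  return {for (final id in placeIds) id: id.length * 2};
}

/// Combines the queries into one super-query, then splits the
/// super-result so that results[i] answers queries[i].
Future<List<int>> computeBatch(List<String> queries) async {
  // Duplicate queries are sent only once.
  final combined = await fetchWalkingMinutes(queries.toSet());
  return [for (final q in queries) combined[q]!];
}

Future<void> main() async {
  print(await computeBatch(['park', 'zoo', 'park'])); // [8, 6, 8]
}
```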

One thing is clear: you will need Completers for the results. You always need a Completer when you want to return a future before you have the value (or another future) to complete it with.
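
A minimal sketch of that pattern, assuming a made-up function answerLater: the caller gets the future right away, and the value is supplied later from a timer callback.

```dart
import 'dart:async';

/// Returns a future immediately; the value arrives when the timer fires.
Future<int> answerLater(Duration delay) {
  final completer = Completer<int>();
  Timer(delay, () => completer.complete(42));
  return completer.future;
}

Future<void> main() async {
  final future = answerLater(const Duration(milliseconds: 10));
  print(await future); // 42
}
```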

The approach I would choose would be:

import "dart:async";

/// A batch of requests to be handled together.
///
/// Collects [Request]s until the pending requests are flushed.
/// Requests can be flushed by calling [flush] or by configuring
/// the batch to automatically flush when reaching certain
/// thresholds.
class BatchRequest<Request, Response> {
  final int? _maxRequests;
  final Duration? _maxDelay;
  final Future<List<Response>> Function(List<Request>) _compute;
  Timer? _timeout;
  List<Request>? _pendingRequests;
  List<Completer<Response>>? _responseCompleters;

  /// Creates a batcher of [Request]s.
  ///
  /// Batches requests until [flush] is called. At that point, the
  /// [batchCompute] function gets the list of pending requests,
  /// and it should respond with a list of [Response]s.
  /// The response to a request in the argument list
  /// should be at the same index in the response list,
  /// and as such, the response list must have the same number
  /// of responses as there were requests.
  ///
  /// If [maxRequestsPerBatch] is supplied, requests are automatically
  /// flushed whenever there are that many requests pending.
  ///
  /// If [maxDelay] is supplied, requests are automatically flushed
  /// when the oldest request has been pending for that long.
  /// As such, [maxDelay] is not the maximum time before a request
  /// is answered, only how long sending the request may be delayed.
  BatchRequest(Future<List<Response>> Function(List<Request>) batchCompute,
      {int? maxRequestsPerBatch, Duration? maxDelay})
      : _compute = batchCompute,
        _maxRequests = maxRequestsPerBatch,
        _maxDelay = maxDelay;

  /// Add a request to the batch.
  ///
  /// The request is stored until the requests are flushed,
  /// then the returned future is completed with the result (or error)
  /// received from handling the requests.
  Future<Response> addRequest(Request request) {
    var completer = Completer<Response>();
    var pending = _pendingRequests ??= [];
    pending.add(request);
    (_responseCompleters ??= []).add(completer);
    var maxDelay = _maxDelay;
    if (pending.length == _maxRequests) {
      _flush();
    } else if (_timeout == null && maxDelay != null) {
      // Start the timer when the first request of a batch arrives.
      _timeout = Timer(maxDelay, _flush);
    }
    return completer.future;
  }

  /// Flush any pending requests immediately.
  void flush() {
    _flush();
  }

  void _flush() {
    final requests = _pendingRequests;
    final completers = _responseCompleters;
    if (requests == null || completers == null) {
      // Nothing is pending, so no timer should be running either.
      assert(_timeout == null);
      return;
    }
    _timeout?.cancel();
    _timeout = null;
    _pendingRequests = null;
    _responseCompleters = null;

    // Future.sync also turns a synchronous throw from _compute into
    // an error future, so the completers below are always completed.
    Future.sync(() => _compute(requests)).then((List<Response> results) {
      if (results.length != completers.length) {
        throw StateError("Wrong number of results. "
            "Expected ${completers.length}, got ${results.length}");
      }
      for (var i = 0; i < results.length; i++) {
        completers[i].complete(results[i]);
      }
    }).catchError((Object error, StackTrace stack) {
      // A failed batch fails every request that was part of it.
      for (var completer in completers) {
        completer.completeError(error, stack);
      }
    });
  }
}

You can use it like this, for example:

void main() async {
  var b = BatchRequest<int, int>(_compute, 
      maxRequestsPerBatch: 5, maxDelay: Duration(seconds: 1));
  var sw = Stopwatch()..start();
  for (int i = 0; i < 8; i++) {
    b.addRequest(i).then((r) {
      print("${sw.elapsedMilliseconds.toString().padLeft(4)}: $i -> $r");
    });
  }
}

Future<List<int>> _compute(List<int> args) => 
    Future.value([for (var x in args) x + 1]);
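
To connect this to the interface from the question, the following is a rough sketch of how create could look. It is a compact, self-contained variant of the same idea rather than a wrapper around the class above, so that it runs on its own. The names _QueueingProvider and PlusOne are made up for illustration, and the sketch assumes both limits are always supplied.

```dart
import 'dart:async';

abstract class BatchComputer<K, V> {
  Future<List<V>> compute(List<K> batchedInputs);
}

abstract class BatchingFutureProvider<K, V> {
  Future<V> submit(K inputValue);
}

/// Hypothetical implementation: queues inputs, flushes on size or timeout.
class _QueueingProvider<K, V> implements BatchingFutureProvider<K, V> {
  final BatchComputer<K, V> _computer;
  final int _maxBatchSize;
  final Duration _maxWait;
  final _inputs = <K>[];
  final _completers = <Completer<V>>[];
  Timer? _timer;

  _QueueingProvider(this._computer, this._maxBatchSize, this._maxWait);

  @override
  Future<V> submit(K inputValue) {
    final completer = Completer<V>();
    _inputs.add(inputValue);
    _completers.add(completer);
    if (_inputs.length >= _maxBatchSize) {
      _flush();
    } else {
      // The timer starts with the oldest pending input.
      _timer ??= Timer(_maxWait, _flush);
    }
    return completer.future;
  }

  void _flush() {
    _timer?.cancel();
    _timer = null;
    final inputs = List<K>.of(_inputs);
    final completers = List<Completer<V>>.of(_completers);
    _inputs.clear();
    _completers.clear();
    Future.sync(() => _computer.compute(inputs)).then((results) {
      if (results.length != completers.length) {
        throw StateError('Expected ${completers.length} results, '
            'got ${results.length}');
      }
      for (var i = 0; i < results.length; i++) {
        completers[i].complete(results[i]);
      }
    }).catchError((Object error, StackTrace stack) {
      for (final completer in completers) {
        completer.completeError(error, stack);
      }
    });
  }
}

BatchingFutureProvider<K, V> create<K, V>(
  BatchComputer<K, V> computer,
  int maxBatchSize,
  Duration maxWaitDuration,
) =>
    _QueueingProvider<K, V>(computer, maxBatchSize, maxWaitDuration);

/// Example computer that also records the size of each batch it receives.
class PlusOne implements BatchComputer<int, int> {
  final batchSizes = <int>[];

  @override
  Future<List<int>> compute(List<int> batchedInputs) async {
    batchSizes.add(batchedInputs.length);
    return [for (final x in batchedInputs) x + 1];
  }
}

Future<void> main() async {
  final computer = PlusOne();
  final provider =
      create<int, int>(computer, 5, const Duration(milliseconds: 50));
  final results = await Future.wait(
      [for (var i = 0; i < 8; i++) provider.submit(i)]);
  print(results); // [1, 2, 3, 4, 5, 6, 7, 8]
  print(computer.batchSizes); // [5, 3]
}
```

With a batch size of 5, the first five submissions are flushed as soon as the fifth arrives, and the remaining three are flushed when the timer fires.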