Search code examples
c++boostfuture

wait_for_any/when_any/WaitAny/WhenAny: What is the correct behavior when passed zero futures/tasks?


There are 4 design options to choose when when_any is passed zero futures, unfortunately all of them make sense.
Till now I am able to

  1. summarize some weak arguments for each design option;
  2. list some implementations and which design options they choose.

Design option 1: when_any(zero future<T>s) should return a future that blocks forever.

The first reason is for the simplicity and uniformity of definition.

if when_any(zero futures) return a future that blocks forever, we get:

wait_all blocks while some is unfinished, until all is finished;
wait_any unblocks if some is finished, not if all is unfinished;

if when_any(zero futures) return a future that is immediately ready we get:

wait_all blocks while some is unfinished, until all is finished;
wait_any unblocks if some is finished, not if all is unfinished, but unblocks if there are zero futures;

The second reason is that when_any is an associative and commutative binary operation, so for variadic version of when_any we want to return the identity element of the when_any operation (which is a future that blocks forever). And in languages where you can define your own binary operators (may be C++ will do that in the future) or where the std::accumulate algorithm is supported, you will still encounter this identity element problem sooner or later.

when_all is like operator&&, and in parameter pack expansion the empty pack expands to true for operator&&, true is like a future that is immediately ready.
when_any is like operator||, and in parameter pack expansion the empty pack expands to false for operator||, false is like a future that is never ready.

(The other places where we need the identity element are:

  • boost::thread::null_mutex to std::scoped_lock (std::scoped_lock is like an associative and commutative binary operation that consumes smaller locks and produce larger lock),
  • std::monostate to std::variant (std::variant is like an associative and commutative binary operation that consumes smaller unions and produce larger union),
  • empty set to operator| in regexp (an operator| node having zero children can happen if you write a program that convert NFA to regexp),
  • empty string to operator concat in regexp,
  • ...

)

How should we treat divergence?

a program might:

  • produce a value;
  • produce an error (the state of the program falls out of the domain of the evaluation function);
  • diverge (for every state X, there exists a state Y that: X ---[evaluation function]--> Y);

Divergence is not a value, divergence is not an error, divergence is divergence.
There are programs that diverges (never terminates), such as operating system, protocol stack, database service and web server.
There are ways to deal with divergence. In C# we have cancelation functionality and progress report functionality. In C++ we can interrupt an execution agent (boost.thread) or destroy an execution agent (boost.context, boost.fiber). We can use thread safe queues or channels to send/receive values/errors to/from an actor continuously.

use case for divergence①:

A programmer uses library1 to consult some unreliable web service on an unreliable network. library1 retries forever because network is unreliable. library1 itself shouldn't store an exception in the shared state when some timeout expires because:

  1. the application layer progrmmers may want to use different cancelation machanisms:
  • when a timeout expires, or
  • when the user clicks a button, or
  • the user clicks a button to start a timeout and when the timeout expires;
  1. the application layer programmers may want to do different things on cancelation:
  • provide a default value, or
  • provide an exception, or
  • some programmers are not at the uppermost layer so they should not attach a cancelation machanism;

Anyway the programmer must use when_any to merge the potentially-block-forever future with his own my-cancelation/fallback-machanism future to get a larger future and the bigger future will not diverge now.
(Assume when_any(several future<T>...) returns future<T> so we don't have to write boilerplate code at every intermidiate when_any node in the future tree.)
(Some modifications required: (1) the bigger future that when_any returns should destroy other child futures when the first child future becomes ready; (2) library1 should use the promise object to check if(shared_state.reference_count == 1) and get to know that the consumer has abandoned the future (i.e. the operation is canceled), and quit that loop;)

use case for divergence②:

A programmer uses library2 to consult some unreliable web service on an unreliable network. library2 retries for n times and then blocks forever, not physically, but logically by setting a bit in the shared_state (shared_state.diverge = true or stared_state.state = state_t::diverge). The programmer use when_any to merge the future from library2 and the my-cancelation/fallback-machanism future. The first ready child future indicates the result. Suppose a failed child future becomes ready with an exception instead of blocking forever, then it answers the bigger future in place of the successful child future that becomes ready later, which is not desired.
(Assume when_any(several future<T>...) returns future<T> so we don't have to write boilerplate code at every intermidiate when_any node in the future tree.)

use case for divergence③:

When testing network code, use a future that is never ready to stand for a client that has very poor network condition, use a future that is immediately ready to stand for a client that has very good network condition, use futures with various timeouts to stand for clients that falls in between.
(Some modifications required: (1) add make_diverging_future; (2) add make_timeout_ready_future;)

Design option 2: when_any(zero future<T>s) should return a future that contains an exception.

c++ - Non-polling implementation of std::when_any() - Code Review Stack Exchange https://codereview.stackexchange.com/questions/176168/non-polling-implementation-of-stdwhen-any

The Concurrency TS's when_any philosophically-incorrectly returns a ready future when called with zero arguments. My version doesn't treat that case specially, and so the natural behavior falls out: the internal promise is destroyed before any 1 of the 0 provided futures has become ready, and so when_any(/*zero args*/) returns a ready future whose get() will throw broken_promise.

I think this is a case of "Fail Early, Fail Loudly". Since he is treating divergence as an error, things become problematic in the above use cases.

Design option 3: when_any(zero future<T>s) should return a future that contains a value of ???.

Design option 4: when_any(zero future<T>s) should be forbidden.

The last 3 design options are used by the standards and libraries. I'll try to guess their motivation below.

Here are some implementations and their behaviours on *_all and *_any:
functions for CPU-bound programs:(if you have trouble reading the table, go to edit mode)

where function behavior when passed zero tasks
boost.thread *_all void wait_for_all(...) return void
boost.thread *_any iterator wait_for_any(...) return the end iterator
boost.fiber *_all void wait_all_simple(...) refused at compile time
vector<R> wait_all_values(...) refused at compile time
vector<R> wait_all_until_error(...) refused at compile time
vector<R> wait_all_collect_errors(...) refused at compile time
R wait_all_members(...) return a value of R
boost.fiber *_any void wait_first_simple(...) return void
R wait_first_value(...) refused at compile time
R wait_first_outcome(...) refused at compile time
R wait_first_success(...) refused at compile time
variant<R> wait_first_value_het(...) refused at compile time
System.Threading.Tasks *_all void Task.WaitAll(...) return void
System.Threading.Tasks *_any int Task.WaitAny(...) return -1

functions for IO-bound programs:

where function behavior when passed zero tasks
std::experimental *_all future<sequence<future<T>>> when_all(...) return a future storing empty sequence
std::experimental *_any future<...> when_any(...) return a future storing { size index = -1, sequence<future<T>> sequence = empty sequence }
boost.thread *_all future<sequence<future<T>>> when_all(...) return a future storing empty sequence
boost.thread *_any future<sequence<future<T>>> when_any(...) return a future storing empty sequence
System.Threading.Tasks *_all Task<TResult[]> Task.WhenAll(...) return a future storing empty sequence
System.Threading.Tasks *_any Task<Task<TResult>> Task.WhenAny(...) refused at run time (throw ArgumentException)

(System.Threading.Tasks.WaitAny(...) accepts zero futures but System.Threading.Tasks.WhenAny(...) refuses at run time.)

The reason why we should not allow when_any(zero tasks) to return a future that blocks forever is perhaps practicality. If we allow that, we open a hole in future's interface saying that every future potentially diverges, so every application layer programmer has to use when_any to merge the future with my-cancelation/fallback-machanism future to get a bigger never-block future if he lacks further information, which is tedious. If we disallow that, we will protect those teams where not all interfaces are documented in detail (let me make an analogy: imagine you are in a C++ company where library functions receive and return potentially-nullptr pointers instead of optional<reference_wrapper<T>> and reference_wrapper<T>, without more information or documentation you have to guard every member access expression with if(p), which is tedious; similarly with futures we have to do when_any(future_returned_from_library, std::make_timeout_future(1s)) everywhere). So we'd better keep the interfaces and possibilities as small as possible.

(apologize to Alan Birtles: I'm sorry I made a mistake that day about the implementations listed: the wait_any functions of boost.fiber other than the first one forbids zero futures and there's a individual implementation that returns a future storing broken_promise (https://codereview.stackexchange.com/questions/176168/non-polling-implementation-of-stdwhen-any) so I am trying to summarize those in a new question.)


Solution

  • I'd go for the standard solution: https://en.cppreference.com/w/cpp/experimental/when_any

    1. If the range is empty (i.e., first == last), the returned future is ready immediately; the futures field of the when_any_result is an empty vector, and the index field is size_t(-1).
    2. If no argument is provided, the returned future is ready immediately; the futures field of the when_any_result is an empty tuple, and the index field is size_t(-1).

    Note also that most of the other "potentially desirable behaviours" can probably easily be composed just by adding an "empty" seed future to any received list:

     auto my_when_any = [](auto... f) {
         return when_any(EmptyT{}, f...);
     };
    

    Here EmptyT could be a future that is always ready, is never ready, or holds an exception depending on your preference.

    This is very similar to e.g. fold expressions where you decide the monoidal "seed": (false || ... || pack) vs. (true && ... && pack) as you referenced.