Tags: c++, multithreading, boost-asio, waitformultipleobjects

What's the difference between WaitForMultipleObjects and boost::asio on multiple windows::basic_handle's?


I have a list of HANDLEs, controlled by a lot of different IO devices. What would be the (performance) difference between:

  1. A call to WaitForMultipleObjects on all these handles
  2. async_read calls on boost::asio::windows::basic_handles wrapping all these handles

Is WaitForMultipleObjects O(n) in time, with n the number of handles?
You can somehow call async_read on a windows::basic_handle, right? Or is that assumption wrong?
If I call run() on the same io_service in multiple threads, will the handler invocations be balanced across those threads? That would be a major benefit of using asio.


Solution

  • Since it sounds like the main benefit you would derive from asio is the fact that it is built on top of IO completion ports (iocp for short), let's start by comparing iocp with WaitForMultipleObjects(). The two are essentially the same comparison as select vs. epoll on Linux.

    The main drawback of WaitForMultipleObjects(), and the one iocp solves, is the inability to scale with many handles. It is O(n): for each event you receive you pass in the full array again, and internally WaitForMultipleObjects() must scan that array to find out which handle was signaled.

    However, this is rarely the real problem, because of the second drawback: WaitForMultipleObjects() has a limit on the maximum number of handles it can wait on (MAXIMUM_WAIT_OBJECTS), which is 64 (see winnt.h). There are ways around this limit, such as creating Event objects, tying multiple sockets to each event, and then waiting on up to 64 events.
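    A minimal sketch of that workaround, assuming Winsock sockets; WSAEventSelect()/WSAEnumNetworkEvents() are the usual way to tie several sockets to one event, and the two helper functions below are hypothetical:

    ```cpp
    #include <winsock2.h>
    #include <vector>
    // link with ws2_32.lib

    // Tie a whole group of sockets to a single event, so that far more than
    // 64 sockets can be multiplexed over at most 64 events.
    WSAEVENT make_group_event(const std::vector<SOCKET>& group)
    {
        WSAEVENT ev = WSACreateEvent();
        for (SOCKET s : group)
            WSAEventSelect(s, ev, FD_READ | FD_CLOSE); // one event, many sockets
        return ev;
    }

    // After a wait call reports this event, every socket in the group has to
    // be checked to find out which of them actually became readable.
    void service_group(const std::vector<SOCKET>& group, WSAEVENT ev)
    {
        for (SOCKET s : group) {
            WSANETWORKEVENTS ne;
            if (WSAEnumNetworkEvents(s, ev, &ne) == 0 &&
                (ne.lNetworkEvents & FD_READ)) {
                // recv() from s here
            }
        }
    }
    ```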

    The third drawback is a subtle "bug" in WaitForMultipleObjects(): it returns the index of a single handle that triggered, which means it can only communicate one event back to the caller per call. This is different from select, which returns all file descriptors that are ready. WaitForMultipleObjects() scans the handles passed to it and returns the first one that is signaled.

    This means that if you are waiting on 10 very active sockets, all of which have an event pending most of the time, there will be a heavy bias toward servicing the first socket in the array passed to WaitForMultipleObjects(). You can work around this: every time the function returns and the event has been serviced, call it again with a timeout of 0, but this time pass in only the part of the array one past the handle that triggered. Repeat until all handles have been visited, then go back to the original call with all handles and the real timeout.
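    A rough sketch of that rescan loop, assuming a hypothetical service_handle() callback and ignoring WAIT_ABANDONED/WAIT_FAILED handling:

    ```cpp
    #include <windows.h>
    #include <vector>

    // handles.size() must not exceed MAXIMUM_WAIT_OBJECTS (64).
    void wait_and_service_all(std::vector<HANDLE>& handles,
                              void (*service_handle)(HANDLE),
                              DWORD timeout_ms)
    {
        DWORD count = static_cast<DWORD>(handles.size());
        DWORD r = WaitForMultipleObjects(count, handles.data(), FALSE, timeout_ms);
        if (r >= WAIT_OBJECT_0 + count)
            return;                       // timeout or error: nothing signaled

        DWORD first = r - WAIT_OBJECT_0;
        service_handle(handles[first]);   // the handle the original call reported

        // Re-poll the tail of the array (everything past the handle that fired)
        // with a timeout of 0, so busy handles early in the array cannot starve
        // the handles that come after them.
        for (DWORD start = first + 1; start < count; ) {
            DWORD n = count - start;
            DWORD s = WaitForMultipleObjects(n, handles.data() + start, FALSE, 0);
            if (s >= WAIT_OBJECT_0 + n)
                break;                    // WAIT_TIMEOUT: nothing else is ready
            DWORD idx = start + (s - WAIT_OBJECT_0);
            service_handle(handles[idx]);
            start = idx + 1;
        }
    }
    ```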

    iocp solves all of these problems, and also introduces an interface for more generic event notification, which is quite nice.

    With iocp (and hence asio):

    1. You tell Windows once which handles you're interested in, and it remembers them; you don't repeat the whole set on every call. This means it scales a lot better with many handles.
    2. You don't have a limit on the number of handles you can wait on
    3. You get every event, i.e. there's no bias towards any specific handle

    I'm not sure about your assumption of calling async_read on an arbitrary handle; you might have to test that. If your handle refers to a socket, I would imagine it would work.
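    If you want to test it, a minimal sketch along these lines should work, assuming the HANDLE was opened with FILE_FLAG_OVERLAPPED (which windows::stream_handle requires):

    ```cpp
    #include <boost/asio.hpp>
    #include <array>
    #include <iostream>
    #include <memory>

    // Wrap an existing overlapped HANDLE and start an asynchronous read on it.
    void start_read(boost::asio::io_service& io, HANDLE raw)
    {
        auto handle = std::make_shared<boost::asio::windows::stream_handle>(io, raw);
        auto buffer = std::make_shared<std::array<char, 512>>();

        handle->async_read_some(
            boost::asio::buffer(*buffer),
            [handle, buffer](const boost::system::error_code& ec, std::size_t n) {
                if (!ec)
                    std::cout << "read " << n << " bytes\n";
            });
    }
    ```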

    As for the threading question: yes. If you run() the io_service in multiple threads, events are dispatched to a free thread, and throughput will scale with more threads. This is a feature of iocp, which even has a thread pool API.
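    For example, something like this runs completion handlers on whichever pool thread happens to be free (io_service::work just keeps run() from returning while there is nothing to do yet):

    ```cpp
    #include <boost/asio.hpp>
    #include <thread>
    #include <vector>

    int main()
    {
        boost::asio::io_service io;
        boost::asio::io_service::work work(io);   // keep run() from returning early

        // One run() call per thread; completion handlers are dispatched to
        // whichever of these threads is free.
        std::vector<std::thread> pool;
        for (unsigned i = 0; i < std::thread::hardware_concurrency(); ++i)
            pool.emplace_back([&io] { io.run(); });

        // ... start async operations here (e.g. the async_read above) ...

        io.stop();                                // shut the pool down again
        for (std::thread& t : pool)
            t.join();
    }
    ```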

    In short: I believe asio or iocp would provide better performance than simply using WaitForMultipleObjects, but whether or not that performance will benefit you mostly depends on how many handles you have and how active they are.