For context, I'm writing an application in Python that needs to poll many hosts continuously, so I create a large number of sockets to communicate with those hosts. However, I can only create 511 sockets - when I try to create a 512th, I get a ValueError: too many file descriptors in select(). I thought this error referred to the maximum number of file descriptors that a process can have open at any given time, but when I try increasing that maximum with pywin32's win32file._setmaxstdio(), it has no effect - no matter what I set the limit to, I can only create 511 sockets. I even tried setting the limit to a value lower than 512 just to see if it would change anything, but I could still create 511 sockets! So as far as I can tell, the limits referenced by _setmaxstdio() and _getmaxstdio() are completely unrelated to the limit on how many sockets/file descriptors select() can handle.
I tried investigating Python's select module to see if I could find where select()'s maximum is defined, or how to increase it. Python's documentation for the select.select() function doesn't mention either of those things, but it does mention that on Windows, select() comes from the Winsock library. So I checked Microsoft's documentation of the select() function:
Four macros are defined in the header file Winsock2.h for manipulating and checking the descriptor sets. The variable FD_SETSIZE determines the maximum number of descriptors in a set. (The default value of FD_SETSIZE is 64, which can be modified by defining FD_SETSIZE to another value before including Winsock2.h.)
I read this to mean "select() can handle 64 sockets by default, but you can change that by altering the value of FD_SETSIZE before you include the header file". So I assume Python sets it to 512 before including the Winsock2 header file? Or is select()'s limit set some other way?
I just want to know where the select() function's limit is defined, how I can check it, and whether it can be increased from within Python, but I'm clearly missing something fundamental here. select() can handle some number of file descriptors, and _setmaxstdio() is used to "[set] a maximum for the number of simultaneously open files at the stream I/O level", but changing the limit with _setmaxstdio() doesn't affect the limit for select(). Why not? If select() isn't limited by the maximum number of file descriptors you're allowed to have, then what is it limited by?
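As far as I know, Python does not expose the select() limit as a constant, so one rough way to check it is to probe for it. The sketch below is only an illustration: the helper name count_selectable_sockets is made up for this post, and it assumes that select.select() raises ValueError once it is handed more descriptors than it supports (the error described above) and that the operating system raises OSError if the process runs out of descriptors first.

```python
import select
import socket


def count_selectable_sockets(limit):
    """Open up to `limit` UDP sockets; return how many select() accepted.

    Hypothetical helper for illustration only. It stops at the first
    ValueError from select() or the first OSError from the OS.
    """
    socks = []
    try:
        for _ in range(limit):
            socks.append(socket.socket(socket.AF_INET, socket.SOCK_DGRAM))
            try:
                # Timeout of 0 makes this a non-blocking poll of all sockets.
                select.select(socks, [], [], 0)
            except ValueError:
                # select() rejected the set, so the newest socket was one too many.
                return len(socks) - 1
        return len(socks)
    except OSError:
        # The process-level descriptor limit was reached before select()'s limit.
        return len(socks)
    finally:
        for s in socks:
            s.close()


if __name__ == "__main__":
    print("select() accepted", count_selectable_sockets(1024), "sockets")
```

The number printed depends on the platform and on how many descriptors the process already has open, so treat it as an observation rather than a documented limit.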
A few days later, my best understanding of the situation is this: the per-process limit on the number of file descriptors IS controlled by _setmaxstdio(), and I was using it correctly, BUT the upper limit set by _setmaxstdio() does not apply if you use the select() function, because select() has a hardcoded limit. In order for the limit you set with _setmaxstdio() to have an actual effect, you must use poll(), epoll(), etc., instead of select(). And if there is a way to increase the limits of the select() function, it seems like you need to recompile part of Windows' C runtime, which is not a good idea. Since I have the option to drop support for Windows and only support Unix instead, I'd much rather just do it that way.
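To see which alternatives to select() are actually available on a given machine, a small check like the following can help. It is a minimal sketch using only the standard select and selectors modules; the function name describe_selector_support is invented for this example, and the output differs between platforms.

```python
import select
import selectors


def describe_selector_support():
    """Return the multiplexing calls this Python build exposes, plus the
    name of the selector class that selectors.DefaultSelector points to."""
    candidates = ("select", "poll", "epoll", "kqueue", "devpoll")
    available = [name for name in candidates if hasattr(select, name)]
    return available, selectors.DefaultSelector.__name__


available, default = describe_selector_support()
print("select module provides:", ", ".join(available))
print("selectors.DefaultSelector is:", default)
```

If the default turns out to be SelectSelector, then code built on the selectors module is going through select() on that platform.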
If you're reading this answer because you want to increase the limits of Windows' select() function, and like me, you are new to the concept of file descriptors/completion ports, haven't worked extensively with sockets before, and don't want to/don't know how to screw around with the C runtime (CRT), I suggest you consider the following before you keep trying to alter the select() limit:
Can you use poll() or epoll() instead of select()? In my case, the only reason I was using select() was that I was locked into it by the libraries I'm using - asyncio and psycopg. At the time of writing, psycopg does not support asyncio's ProactorEventLoop in Windows, so I was forced to use the SelectorEventLoop (which uses select()) instead of the default ProactorEventLoop, which doesn't use select(). However, psycopg has no such limitation in Unix - it will work with any asyncio event loop there. So if you can use poll(), epoll(), etc. instead of select(), then that's probably a lot easier than trying to actually increase the select() limit, since then you'll be able to just use _setmaxstdio() to set the file descriptor limit, and it'll actually let you have more file descriptors.
Instead of continuing to try to make this work with Windows, I'm just going to use Unix instead. I probably should have done that from the beginning since it's a backend kind of process, but the convenience of being able to run/test/debug the code directly on my development machine was tempting, and I thought that the added flexibility of supporting both Windows and Unix would be worth the (what I thought would be minimal) overhead of adding "if the OS is Windows, use SelectorEventLoop" to the start of my application.
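For reference, the "if the OS is Windows, use SelectorEventLoop" check mentioned above could look roughly like this. It is a sketch, not my actual application code: the names new_platform_loop, run_with_platform_loop and main are placeholders, and it assumes that constructing asyncio.SelectorEventLoop() directly is acceptable for the application.

```python
import asyncio
import sys


def new_platform_loop():
    """Create a SelectorEventLoop on Windows, the default loop elsewhere."""
    if sys.platform == "win32":
        # Subject to select()'s limit, as described in this post.
        return asyncio.SelectorEventLoop()
    return asyncio.new_event_loop()


def run_with_platform_loop(coro):
    """Run a coroutine to completion on the platform-specific loop."""
    loop = new_platform_loop()
    try:
        return loop.run_until_complete(coro)
    finally:
        loop.close()


async def main():
    # Placeholder for the real polling logic.
    await asyncio.sleep(0)
    return "done"


if __name__ == "__main__":
    print(run_with_platform_loop(main()))
```

Creating the loop explicitly keeps the platform check in one place, but on Windows it still leaves the application bound by the select() limit.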