Assuming that I have a CPU with 4 cores and 4 threads, does it make sense to run e.g. 8 PHP-FPM workers by setting the pm.max_children = 8 option? As far as I understand, a CPU with 4 threads can only run up to 4 processes in "real" parallel. Wouldn't that add overhead, with CPU time lost to context switching between these 8 processes?
In contrast, the Node.js cluster mode documentation recommends running at most as many workers/children as there are cores. Doesn't the same recommendation apply here?
The general answer is yes, it does make sense: although you can't run that many workers in parallel, you can run them concurrently.
The key thing to understand is that in most real applications, a lot of the time spent processing a request is not spent using the local CPU - it's spent waiting for database queries, external APIs, even disk access. If you have only one worker per CPU core, that core simply sits idle whenever its worker is waiting. Allow additional workers, and one can be using the CPU while another is waiting for external data.
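To make that concrete, here is a rough sketch in Python rather than PHP. It uses threads as a stand-in for PHP-FPM worker processes and `time.sleep` as a stand-in for a database or API call; the names `handle_request` and `serve` are made up for illustration. It only models the waiting part of a request, so treat it as a demonstration of the idea, not a benchmark.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(wait_seconds: float) -> None:
    # Stand-in for a request that spends its time waiting on a database or API.
    # While sleeping, the worker uses no CPU at all.
    time.sleep(wait_seconds)


def serve(requests: int, workers: int, wait_seconds: float = 0.1) -> float:
    """Return the wall-clock seconds needed to finish all requests."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker picks up the next request as soon as it is free.
        list(pool.map(handle_request, [wait_seconds] * requests))
    return time.perf_counter() - start


if __name__ == "__main__":
    for workers in (4, 8):
        print(f"{workers} workers: {serve(8, workers):.2f}s for 8 requests")
```

With 8 waiting requests, 4 workers need two rounds while 8 workers need one, regardless of how many cores the machine has.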
Limiting yourself to one worker per core would only make sense if your application is highly unusual and spends close to 100% of its time using the CPU.
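A common rule of thumb for picking a starting point (an assumption on my part, not something from the PHP-FPM documentation) is cores × (1 + wait time / CPU time) per request. The sketch below is in Python, and `suggested_workers` is a hypothetical helper; in practice, available memory per worker also caps pm.max_children, and you should measure rather than trust the formula.

```python
import math


def suggested_workers(cores: int, wait_ms: float, cpu_ms: float) -> int:
    """Rule-of-thumb worker count: cores * (1 + wait / cpu).

    wait_ms: average time a request spends waiting (DB, APIs, disk).
    cpu_ms:  average time a request spends actually using the CPU.
    """
    if cores < 1 or cpu_ms <= 0 or wait_ms < 0:
        raise ValueError("cores must be >= 1, cpu_ms > 0 and wait_ms >= 0")
    return math.ceil(cores * (1 + wait_ms / cpu_ms))


if __name__ == "__main__":
    print(suggested_workers(4, wait_ms=0, cpu_ms=10))   # purely CPU-bound
    print(suggested_workers(4, wait_ms=10, cpu_ms=10))  # half the time waiting
    print(suggested_workers(4, wait_ms=30, cpu_ms=10))  # mostly waiting
```

Note that a purely CPU-bound request (no waiting) brings the result back down to one worker per core, which matches the exception described above.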
The reason the one-worker-per-core recommendation works for Node.js is that it implements concurrency within a single thread using asynchronous code: you can tell the current thread "start doing this, and while waiting for the result, get on with processing a different request". This isn't possible with native PHP, which uses a "shared nothing" approach - each request gets its own thread or process - but there are projects such as Swoole and Amp that add support for this asynchronous approach.
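For comparison, this is roughly what the single-threaded asynchronous model looks like, sketched with Python's asyncio as an analogue of the Node.js event loop (it is not Node.js, Swoole or Amp code, and `handle_request` and `serve` are illustrative names). Eight "requests" overlap their waiting time while all running on one thread.

```python
import asyncio
import threading


async def handle_request(request_id: int, seen_threads: set) -> int:
    # Record which OS thread is running this request.
    seen_threads.add(threading.get_ident())
    # Stand-in for a DB/API call: awaiting hands control back to the
    # event loop, which gets on with another request in the meantime.
    await asyncio.sleep(0.05)
    return request_id


async def serve(count: int) -> tuple[list[int], int]:
    seen_threads: set = set()
    results = await asyncio.gather(
        *(handle_request(i, seen_threads) for i in range(count))
    )
    # Return the finished request ids and how many threads were used.
    return list(results), len(seen_threads)


if __name__ == "__main__":
    finished, thread_count = asyncio.run(serve(8))
    print(f"finished {len(finished)} requests on {thread_count} thread(s)")
```

Because the waiting is already overlapped inside each process, extra processes beyond the core count add little, which is why the advice differs from PHP-FPM.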