How to handle a situation that requires releasing a lock during a blocking call?

I have some "FreeOnTerminate" worker threads which add their handles to a TThreadList when they start executing and remove them from it when their execution ends. They also check a global event object that notifies them to cancel their work.

The following is the part that runs in the main thread, which signals the event and waits for any remaining worker threads to end. WorkerHandleList is the global TThreadList.

...

procedure WaitForWorkers;
var
  ThreadHandleList: TList;
begin
  ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
  TWorkerThread.WorkerHandleList.UnlockList;
  WaitForMultipleObjects(ThreadHandleList.Count,
      PWOHandleArray(ThreadHandleList.List), True, INFINITE);
end;

initialization
  TWorkerThread.RecallAllWorkers := TEvent.Create;
  TWorkerThread.WorkerHandleList := TThreadList.Create;

finalization
  TWorkerThread.RecallAllWorkers.SetEvent;
  WaitForWorkers;

  TWorkerThread.RecallAllWorkers.Free;
  TWorkerThread.WorkerHandleList.Free;

This design, I think, has a flaw: I have to unlock the list just before waiting on the threads' handles, because holding the lock during the wait would cause a deadlock, since the threads themselves remove their handles from the same list. Without any lock, a context switch could let a thread free itself, causing WaitForMultipleObjects to return immediately with WAIT_FAILED. I can't employ another lock either, since WaitForMultipleObjects is blocking and I wouldn't be able to release the lock from the main thread.

I can modify this design in a number of ways, including not using FreeOnTerminate threads, which would guarantee valid handles until they are explicitly freed, or modifying the list of thread handles only from the main thread. Or probably others...

But what I want to ask is, is there a solution to this kind of problem without changing the design? For instance, would sleeping in the worker thread code before the threads remove their handles from the list, or calling SwitchToThread, give all non-worker threads a chance to run? A long enough one?

Solution

Your use of LockList() is wrong, and dangerous. As soon as you call UnlockList(), the TList is no longer protected, and will be modified as the worker threads remove themselves from the list. That can happen before you have a chance to call WaitForMultipleObjects(), or worse, WHILE the call stack for it is being set up.

What you need to do instead is lock the list, copy the handles to a local array, unlock the list, and then wait on the array. DO NOT wait on the TList itself directly.

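procedure WaitForWorkers;
var
  ThreadHandleList: TList;
  ThreadHandleArr: array of THandle;
  I: Integer;
begin
  // Copy the handles while the list is locked so the snapshot is consistent.
  ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
  try
    SetLength(ThreadHandleArr, ThreadHandleList.Count);
    for I := 0 to ThreadHandleList.Count-1 do
      ThreadHandleArr[I] := THandle(ThreadHandleList[I]);
  finally
    TWorkerThread.WorkerHandleList.UnlockList;
  end;

  // Wait on the local copy, never on the TList's internal storage.
  // A count of zero makes WaitForMultipleObjects() fail, so skip the call then.
  if Length(ThreadHandleArr) > 0 then
    WaitForMultipleObjects(Length(ThreadHandleArr), PWOHandleArray(ThreadHandleArr), True, INFINITE);
end;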

However, even that has a race condition. Some of the worker threads may have already terminated, and thus destroyed their handles, before WaitForMultipleObjects() is actually entered. And the remaining threads will destroy their handles WHILE it is running. Either way, it fails. You CANNOT destroy the thread handles while you are actively waiting on them.

FreeOnTerminate=True can only be used safely for threads that you start and then forget even exist. It is very dangerous to use FreeOnTerminate=True when you still need to access the threads for any reason (it is especially because of this caveat that TThread.WaitFor() tends to crash when FreeOnTerminate=True - the thread handle and even the TThread object itself get destroyed while they are still being used!).

You need to re-think your waiting strategy. I can think of a few alternatives:

Don't use WaitForMultipleObjects() at all. It is safer, but less efficient, to simply re-lock the list periodically and check if it is empty or not:

procedure WaitForWorkers;
var
  ThreadHandleList: TList;
begin
  repeat
    ThreadHandleList := TWorkerThread.WorkerHandleList.LockList;
    try
      if ThreadHandleList.Count = 0 then Exit;
    finally
      TWorkerThread.WorkerHandleList.UnlockList;
    end;
    Sleep(500);
  until False;
end;

Get rid of WorkerHandleList altogether and use a semaphore or interlocked counter instead to keep track of how many threads have been created and have not been destroyed yet. Exit the wait when the semaphore/counter indicates that no more threads exist.

Like Ken B suggested, keep using WorkerHandleList, but wait on a manual-reset event that gets reset when the first thread is added to the list (do that in the thread constructor, not in Execute()) and signaled when the last thread is removed from the list (do that in the thread destructor, not in Execute() or DoTerminate()).
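
The event-based alternative could look roughly like the following. This is only a minimal sketch, not a drop-in replacement for your code: it assumes Delphi XE2 or later (for the scoped unit names), and the names AllWorkersDone and WorkerList as well as the placeholder Execute() loop are made up for illustration. The list stores the thread objects rather than their handles, since nothing waits on the handles anymore. The event is set and reset while the list is locked, so adding the first thread and removing the last one cannot interleave.

```delphi
program WaitForWorkersSketch;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.Classes, System.SyncObjs;

type
  TWorkerThread = class(TThread)
  protected
    procedure Execute; override;
  public
    class var RecallAllWorkers: TEvent; // tells the workers to cancel
    class var AllWorkersDone: TEvent;   // hypothetical: signaled while the list is empty
    class var WorkerList: TThreadList;  // hypothetical stand-in for WorkerHandleList
    constructor Create;
    destructor Destroy; override;
  end;

constructor TWorkerThread.Create;
var
  List: TList;
begin
  inherited Create(False); // the thread starts running only after construction completes
  FreeOnTerminate := True;
  List := WorkerList.LockList;
  try
    List.Add(Self);
    if List.Count = 1 then
      AllWorkersDone.ResetEvent; // first worker registered
  finally
    WorkerList.UnlockList;
  end;
end;

destructor TWorkerThread.Destroy;
var
  List: TList;
begin
  inherited Destroy; // releases the thread handle first
  List := WorkerList.LockList;
  try
    List.Remove(Self); // Self is only used as a pointer value here
    if List.Count = 0 then
      AllWorkersDone.SetEvent; // last worker is gone
  finally
    WorkerList.UnlockList;
  end;
end;

procedure TWorkerThread.Execute;
begin
  // Placeholder work loop: poll the cancel event every 100 ms.
  while RecallAllWorkers.WaitFor(100) <> wrSignaled do
    ; // real work would go here
end;

var
  I: Integer;
begin
  // Both events are manual-reset; AllWorkersDone starts signaled (no workers yet).
  TWorkerThread.RecallAllWorkers := TEvent.Create(nil, True, False, '');
  TWorkerThread.AllWorkersDone := TEvent.Create(nil, True, True, '');
  TWorkerThread.WorkerList := TThreadList.Create;
  try
    for I := 1 to 3 do
      TWorkerThread.Create;

    TWorkerThread.RecallAllWorkers.SetEvent;
    // Waits on the event only, never on a thread handle.
    TWorkerThread.AllWorkersDone.WaitFor(INFINITE);
    Writeln('All workers finished');
  finally
    TWorkerThread.WorkerList.Free;
    TWorkerThread.AllWorkersDone.Free;
    TWorkerThread.RecallAllWorkers.Free;
  end;
end.
```

Because the main thread never touches a thread handle or a TThread object after creating it, FreeOnTerminate=True stays safe here. The sketch does assume that no new workers are created once the main thread has started waiting.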