Search code examples
multithreadingdelphicritical-section

Why does my multi-threaded app sometimes hang when it closes?


I'm using several critical sections in my application. The critical sections prevent large data blobs from being modified and accessed simultaneously by different threads.

AFAIK it's all working correctly except sometimes the application hangs when exiting. I'm wondering if this is related to my use of critical sections.

Is there a correct way to free TCriticalSection objects in a destructor?

Thanks for all the answers. I'm looking over my code again with this new information in mind. Cheers!


Solution

  • As Rob says, the only requirement is that you ensure that the critical section is not currently owned by any thread. Not even the thread about to destroy it. So there is no pattern to follow for correctly destroying a TCriticalSection, as such. Only a required behaviour that your application must take steps to ensure occurs.

    If your application is locking then I doubt it is the free'ing of any critical section that is responsible. As MSDN says (in the link that Rob posted), the DeleteCriticalSection() (which is ultimately what free'ing a TCriticalSection calls) does not block any threads.

    If you were free'ing a critical section that other threads were still trying to access you would get access violations and other unexpected behaviours, not deadlocks, as this little code sample should help you demonstrate:

    implementation
    
    uses
      syncobjs;
    
    
      type
        tworker = class(tthread)
        protected
          procedure Execute; override;
        end;
    
    
      var
        cs: TCriticalSection;
        worker: Tworker;
    
    
    procedure TForm2.FormCreate(Sender: TObject);
    begin
      cs := TCriticalSection.Create;
    
      worker := tworker.Create(true);
      worker.FreeOnTerminate := TRUE;
      worker.Start;
    
      sleep(5000);
    
      cs.Enter;
    
      showmessage('will AV before you see this');
    end;
    
    { tworker }
    
    procedure tworker.Execute;
    begin
      inherited;
      cs.Free;
    end;
    

    Add to the implementation unit of a form, correcting the "TForm2" reference for the FormCreate() event handler as required.

    In FormCreate() this creates a critical section then launches a thread whose sole purpose is to free that section. We introduce a Sleep() delay to give the thread time to initialise and execute, then we try to enter the critical section ourselves.

    We can't of course because it has been free'd. But our code doesn't hang - it is not deadlocked trying to access a resource that is owned by something else, it simply blows up because, well, we're trying to access a resource that no longer exists.

    You could be even more sure of creating an AV in this scenario by NIL'ing the critical section reference when it is free'd.

    Now, try changing the FormCreate() code to this:

      cs := TCriticalSection.Create;
    
      worker := tworker.Create(true);
      worker.FreeOnTerminate := TRUE;
    
      cs.Enter;
      worker.Start;
    
      sleep(5000);
    
      cs.Leave;
    
      showmessage('appearances can be deceptive');
    

    This changes things... now the main thread will take ownership of the critical section - the worker thread will now free the critical section while it is still owned by the main thread.

    In this case however, the call to cs.Leave does not necessarily cause an access violation. All that occurs in this scenario (afaict) is that the owning thread is allowed to "leave" the section as it would expect to (it doesn't of course, because the section has gone, but it seems to the thread that it has left the section it previously entered) ...

    ... in more complex scenarios an access violation or other error is possibly likely, as the memory previously used for the critical section object may be re-allocated to some other object by the time you call it's Leave() method, resulting in some call to some other unknown object or access to invalid memory etc.

    Again, changing the worker.Execute() so that it NIL's the critical section ref after free'ing it would ensure an access violation on the attempt to call cs.Leave(), since Leave() calls Release() and Release() is virtual - calling a virtual method with a NIL reference is guaranteed to AV (ditto for Enter() which calls the virtual Acquire() method).

    In any event:

    Worst case: an exception or weird behaviour

    "Best" case: the owning thread appears to believe it has "left" the section as normal.

    In neither case is a deadlock or a hang going to occur simply as the result of when a critical section is free'd in one thread in relation to when other threads then try to enter or leave that critical section.

    All of which is a round-a-bout way of saying that it sounds like you have a more fundamental race condition in your threading code not directly related to the free'ing of your critical sections.

    In any event, I hope my little bit of investigative work might set you down the right path.