Search code examples
c++multithreadingboostpure-virtualboost-mutex

Boost Thread start failure if thread object declared as member


i've written a class named Task which encapsulates a boost::thread and allow to override the run() method to do some job on the newly created thread.

Here is the base class:

class Task {


    typedef boost::function<void ()> TaskEventCallback;
    typedef boost::unordered_map<string, TaskEventCallback> Callbacks;
    typedef boost::unordered_map<string, Callbacks> SessionTaskMap;
    typedef boost::unordered_map<TaskListener *, SessionTaskMap> ListenersMap;

    public:
        Task(NGSAppServer& server);
        Task(const Task& orig);
        virtual ~Task();

        virtual void run() = 0;
        bool start();
        bool pause();
        bool cancel();

        virtual bool registerListener(TaskListener *);
        virtual bool unregisterListener(TaskListener *);
        string getProgress();
        string getStatusMessage();

        boost::thread * getThread();

    protected:
        void postEvent(string event);
        void startThread();

        void setProgress(string progress);
        void setStatusMessage(string statusMessage);


        vector<TaskListener *> listeners;
        bool taskRunning;
        bool taskStarted;
        bool taskCanceled;
        bool taskEnded;
        NGSAppServer& server;

        boost::thread worker;
        boost::recursive_mutex mutex;

        ListenersMap listeners_map;

    private:
        string progress;
        string statusMessage;

};

The class is able to post events to multiple http session via the server class but it's not pertinent here. Everything works right, the thread starts and post successfully events, until end of work. This is a working snippet:

        RestoreTask * task = new RestoreTask(application->getServer());
        TaskListener * listener = new TmpTL(*task, progressText, this);
        task->start();

And this is the Restore class:

        class Restore : public Task {

        public:
            Restore(NGSAppServer& server);
            Restore(const Restore& orig);
            virtual ~Restore();

            virtual void run();

        private:
            ... stuffs ...
        };

Now i've attempted to split the work of Restore task in N subTasks (Worker, also a subclass of Task). Here the new run method of Restore:

            std::vector<Worker *> workers;
            for(uint i = 0; i < 2; i++){
                //Start tread
                Worker _worker(this, server);
                _worker.start();
                workers.push_back(&_worker);
            }

            //Join Workers
            for(uint i = 0; i < 2; i++){
                workers.at(i)->getThread()->join();
            }

This code is failing since the start of the child thread create a sigfault while trying to run the Worker class run method since it's reported as pure virtual and moreover the attempt to lock the mutex on the Task base class is failing on this assertion:

void boost::recursive_mutex::lock(): Assertion `!pthread_mutex_lock(&m)' failed.

Directly creating a Worker object and starting it (as for Restore) doesn't create any issue!

On the first sight it may appear that the Restore run() method has ended before children threads, deleting Worker instances and then making the call on run to the base class (pure virtual) and trying to access a destroyed mutex. (PLEASE CORRECT ME IF I AM WRONG HERE!)

Using the debugger to drill down the issue i've found that's not the case. The problem seems to stay inside Worker objects declaration since the following change make the code works without any issue:

            std::vector<Worker *> workers;
            for(uint i = 0; i < 2; i++){
                //Start tread
                Worker * _worker = new Worker(this, server);
                _worker->start();
                workers.push_back(_worker);
            }

            for(uint i = 0; i < 2; i++){
                workers.at(i)->getThread()->join();
                delete workers.at(i);
            }

I'd prefer to create Workers without the new operator since i don't really need to keep these objects alive after Restore::run() has finished and I should be able to guarantee these objects will still exists until children completion due to the threads join (which has been verified with debugger).

Who can find the bottleneck here?

I've been able just to find a way to run my code, the solution (but for me more important is the explanation here) is still missing.

Best Regards


Solution

  • The _worker will get out out of scope and destructed when for loop is repeated. You can put a print in the destructor to verify this.

    What you do in the second ( the new ..delete) is the correct thing, perhaps you can use the smart_ptr / make_ptr to avoid deletes.

    You can also make a array of Workers instead of for-loop, in which case you'll have to use default ctor, and pass initializers ( this, start ) in some other way