
Effective wordcount multithreading in C++


I am fairly new to C++ and multi-threading and need some help with creating a word count that effectively divides work between multiple threads.

Suppose I have a function that counts the words in a line (a string):

count_words_in_line(line);
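
The body of count_words_in_line is not shown here. A minimal sketch, assuming a word is any run of non-whitespace characters (the real function may define words differently), might look like this:

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical implementation: counts whitespace-separated words in one line.
std::size_t count_words_in_line(const std::string& line) {
    std::istringstream words(line);
    std::string word;
    std::size_t count = 0;
    // operator>> skips leading whitespace and extracts one word at a time.
    while (words >> word) {
        ++count;
    }
    return count;
}
```

It touches no shared state, so it is safe to call from several threads at once as long as each thread passes its own string.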

With one thread, the total word count is simply the sum of this function's output over all lines, but how do I divide that work among several threads?

My idea was to use two threads, one to count the odd-numbered lines and one to count the even-numbered lines, but the code crashes with a segmentation fault.

What am I doing wrong? Is there a better approach?

I would like to not use a threadpool, and would ideally like to specify the number of threads in an argument in order to measure the performance of the multithreaded implementation.

Here is my relevant code:

bool odd = true;
auto thread_count_odd = [&counter, &infile, &odd, &line, &mutex]() {
    while (std::getline(infile, line)) {
        if (odd) {
            std::cout<<"Count odd"<<std::endl;
            mutex.lock();
            counter += count_words_in_line(line);
            mutex.unlock();
        }
        odd = !odd;
    }
};

bool even = false;
auto thread_count_even = [&counter, &infile, &even, &line, &mutex]() {

    while (std::getline(infile, line)) {
        if (even) {
            std::cout<<"Count even"<<std::endl;
            mutex.lock();
            counter += count_words_in_line(line);
            mutex.unlock();
        }
        even = !even;
    }
};

std::thread t1(thread_count_odd);
std::thread t2(thread_count_even);

t1.join();
t2.join();

Solution

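  • I think the issue is that you need a mutex around the getline call. Both threads read from infile and write to the same line variable at the same time without any synchronization, which is a data race and is probably what causes the crash.

    The code below should work for your situation. It uses condition variables to make the threads take turns reading from the file. Hope this helps.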

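    #include <condition_variable>
    #include <fstream>
    #include <functional>
    #include <iostream>
    #include <mutex>
    #include <string>
    #include <thread>
    #include <vector>
    #define MAX_THREADS 50
    using namespace std;

    condition_variable cv[MAX_THREADS]; // one per thread, used to pass the turn along
    mutex m1;                           // guards inFile, turn and counter
    int turn = 0;                       // id of the thread allowed to read next
    int counter = 0;

    int count_words_in_line(const string &line) {
        /*write your code here*/
        return 1;
    }

    // Threads read lines round-robin: thread tid reads one line when it is
    // its turn, passes the turn to the next thread, then counts outside the lock.
    void countWords(int tid, ifstream &inFile, int tcount)
    {
        int local = 0;
        while (true)
        {
            string line;
            {
                unique_lock<mutex> lock(m1);
                // The predicate protects against spurious and missed wakeups.
                cv[tid].wait(lock, [&] { return turn == tid; });
                bool ok = static_cast<bool>(getline(inFile, line));
                turn = (tid + 1) % tcount;
                cv[turn].notify_one();
                if (!ok) break; // end of file; the turn was passed on so the others can exit too
            }
            local += count_words_in_line(line);
        }
        // Add this thread's subtotal to the shared counter once, under the lock.
        lock_guard<mutex> lock(m1);
        counter += local;
    }

    int main()
    {
        ifstream inFile("input.txt");
        if (!inFile) {
            cerr << "Cannot open input.txt" << endl;
            return 1;
        }
        int tcount = 2; // number of threads, at most MAX_THREADS

        vector<thread> threads;
        for (int i = 0; i < tcount; i++)
            threads.emplace_back(countWords, i, ref(inFile), tcount);

        for (auto &t : threads)
            t.join();

        cout << counter << endl;
        return 0;
    }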
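
As a sketch of the simpler fix described above (a mutex around the getline call), each thread can hold the lock only while it reads a line, keep its own line buffer and running total, and add that total to the shared result once at the end. The names count_words_parallel, io_mutex and total_mutex are hypothetical, the count_words_in_line body is a stand-in that assumes whitespace-separated words, and the thread count is a parameter so that it can come from a command-line argument:

```cpp
#include <cstddef>
#include <istream>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the question's function: counts whitespace-separated words.
std::size_t count_words_in_line(const std::string& line) {
    std::istringstream words(line);
    std::string word;
    std::size_t n = 0;
    while (words >> word) ++n;
    return n;
}

// Hypothetical helper: counts all words in `in` using `num_threads` plain std::thread workers.
std::size_t count_words_parallel(std::istream& in, unsigned num_threads) {
    if (num_threads == 0) num_threads = 1;  // always run at least one worker

    std::mutex io_mutex;     // guards the shared stream
    std::mutex total_mutex;  // guards `total`
    std::size_t total = 0;

    auto worker = [&]() {
        std::size_t local = 0;  // per-thread subtotal, no locking needed
        std::string line;       // per-thread buffer, never shared
        while (true) {
            {
                // Only one thread at a time may touch the stream.
                std::lock_guard<std::mutex> lock(io_mutex);
                if (!std::getline(in, line)) break;
            }
            // The counting itself runs without holding any lock.
            local += count_words_in_line(line);
        }
        std::lock_guard<std::mutex> lock(total_mutex);
        total += local;
    };

    std::vector<std::thread> threads;
    for (unsigned i = 0; i < num_threads; ++i) threads.emplace_back(worker);
    for (auto& t : threads) t.join();
    return total;
}
```

From main, this could be called with a std::ifstream and a thread count parsed from argv, for example with std::stoul. Whether it beats a single thread depends on how expensive count_words_in_line is compared with taking the lock once per line, so it is worth timing it with several thread counts.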