
C++ How would I make this code more efficient?


I have an array of words and a text file. I want to search through the text file and count how many times each word in the array appears in it.

I have thought about using a for loop, but that just gave me the total word count, not the individual count for each word. I can't put the text file into an array, as there are about 40000 words in it.

After the count, I want to divide each count by an integer value known as 'scale', and then multiply a string by the new count (that is, repeat it that many times).

This is how I am currently doing it, as shown below. Is there any way I can make this more efficient?

Any help is greatly appreciated.

Array of words = testwords.

Name of file = testF.

inWord = each word in the file.

while (testF >> inWord) {
    if (inWord == testwords[0]) {
        count1++;
    }
    if (inWord == testwords[1]) {
        count2++;
    }
    if (inWord == testwords[2]) {
        count3++;
    }
    if (inWord == testwords[3]) {
        count4++;
    }
    if (inWord == testwords[4]) {
        count5++;
    }
    if (inWord == testwords[5]) {
        count6++;
    }
    if (inWord == testwords[6]) {
        count7++;
    }
    if (inWord == testwords[7]) {
        count8++;
    }
}
cout << testwords[0] << " " << count1 << " " << s1.append(count1/scale, '*') << endl;
cout << testwords[1] << " " << count2 << " " << s2.append(count2/scale, '*') << endl;
cout << testwords[2] << " " << count3 << " " << s3.append(count3/scale, '*') << endl;
cout << testwords[3] << " " << count4 << " " << s4.append(count4/scale, '*') << endl;
cout << testwords[4] << " " << count5 << " " << s5.append(count5/scale, '*') << endl;
cout << testwords[5] << " " << count6 << " " << s6.append(count6/scale, '*') << endl;
cout << testwords[6] << " " << count7 << " " << s7.append(count7/scale, '*') << endl;
cout << testwords[7] << " " << count8 << " " << s8.append(count8/scale, '*') << endl;

Solution

  • Before you worry about efficiency, you should worry about approach. You're not using logical data structures. Instead of having 8 separate counts, keep an array of counts. Or better yet, keep a map of word -> count.

    Luckily, in this situation, cleaner code also corresponds to much faster execution.

    In particular, use a std::map<std::string, size_t>.

    Alternatively, if you're using C++11, you could use a std::unordered_map for likely better performance.

    Assuming you're reading your words from cin:

    std::map<std::string, size_t> counts;
    
    std::string word;
    
    while (std::cin >> word) {
        ++counts[word];
    }
    
    for (std::map<std::string, size_t>::const_iterator it = counts.begin(),
         end = counts.end(); it != end; ++it) {
        std::cout << "The word '" << it->first << "' appeared "
                  << it->second << " times" << std::endl;
    }
    

    Documentation for std::map.

    Documentation for std::unordered_map.

    For what it's worth, std::unordered_map is, in practice, always implemented as a hash table (average constant-time lookup), and std::map is almost always implemented as a balanced binary search tree, typically a red-black tree (logarithmic-time lookup).
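
To tie this back to the question, here is a minimal sketch (C++11) that counts only the words in a given list and builds the scaled row of stars. The names countWords and formatLine are made up for illustration. The sketch assumes that words are separated by whitespace, that matching is exact (case-sensitive, punctuation included), and that scale is non-zero.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: count how often each word in `wanted` occurs in the
// whitespace-separated stream `in`. Words not listed in `wanted` are ignored.
std::unordered_map<std::string, std::size_t>
countWords(std::istream& in, const std::vector<std::string>& wanted) {
    std::unordered_map<std::string, std::size_t> counts;
    for (const std::string& w : wanted) {
        counts[w] = 0;  // seed every wanted word so absent words report 0
    }

    std::string word;
    while (in >> word) {
        // One hash lookup per input word instead of one comparison per wanted word.
        auto it = counts.find(word);
        if (it != counts.end()) {
            ++it->second;
        }
    }
    return counts;
}

// Hypothetical helper: build "<word> <count> <stars>", where the number of
// stars is count / scale (integer division, as in the question).
std::string formatLine(const std::string& word, std::size_t count, std::size_t scale) {
    assert(scale != 0);  // assumption: the caller never passes a zero scale
    std::ostringstream out;
    out << word << ' ' << count << ' ' << std::string(count / scale, '*');
    return out.str();
}
```

With the question's variables, this would be used roughly as: call countWords(testF, words), where words is a std::vector<std::string> holding the contents of testwords, then loop over words and print formatLine(w, counts.at(w), scale) for each one. Looping over the word list rather than over the map keeps the output in the original order, since std::unordered_map does not guarantee any iteration order.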