Tags: c++, string, file, vector, string-comparison

C++ String comparison not working in vector iteration for word counting algorithm


I am new to programming in C++ and am currently trying to write a program that counts how many times each word occurs in a string read from a .txt file.

My issue is that when I use a vector to store the words and count duplicates by comparing them, the program sometimes skips some words.

    for(int i = 0;i<words.size();i++) {  //Using nested for loops to counts the words
        finalWords.push_back(words[i]);//Words that are unique will be counted 
        int counts = 1;
        for(int j = i + 1; j<words.size();j++) {
            if(words[i] == words[j]) {
                counts++;
                words.erase(words.begin() + j); //Removing the words that is not unique
             }
             continue;
         }
         wordCount.push_back(counts);
     }

In my full code, words is a string vector filled with similar words, finalWords is an empty string vector, and wordCount is an int vector that stores the count for each word in finalWords. I thought the problem was unprintable characters such as newlines, but when I checked the input, the strings where the comparison failed were not the ones near line breaks. Is there something I missed? If so, what do I need to do to fix it?

Thank you in advance!


Solution

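  • When you erase the element at index j, the element that followed it moves down to index j, not j+1. Because your loop then still does j++, that element is never compared, so consecutive duplicates are missed.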

    The loop should go somewhat like this:

    for (int j = i + 1; j < words.size(); ) {   // no increment here
        if (words[i] == words[j]) {
            counts++;
            words.erase(words.begin() + j);      // the next element is now at index j
            // no increment here
        } else {
            ++j;    // increment here
        }
    }
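
For reference, here is a minimal sketch of the whole counting routine with that fix applied. The function name countWords and its signature are assumptions made for illustration; only words, finalWords, wordCount and counts come from the question. It takes words by value because the loop erases from it, and it expects both output vectors to start out empty.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are not from the question).
// Counts each distinct word, keeping the question's two parallel vectors.
// `words` is a copy on purpose: duplicates are erased from it while counting.
void countWords(std::vector<std::string> words,
                std::vector<std::string>& finalWords,
                std::vector<int>& wordCount) {
    for (std::size_t i = 0; i < words.size(); ++i) {
        finalWords.push_back(words[i]);
        int counts = 1;
        for (std::size_t j = i + 1; j < words.size(); ) {  // no increment here
            if (words[i] == words[j]) {
                ++counts;
                words.erase(words.begin() + j);  // the next element slides into index j
            } else {
                ++j;  // advance only when nothing was erased
            }
        }
        wordCount.push_back(counts);
    }
    // Sanity check: holds as long as both outputs started with the same length.
    assert(finalWords.size() == wordCount.size());
}
```

Calling it with the words a, a, a, b, a, b, c should leave finalWords as a, b, c and wordCount as 4, 2, 1. With the original j++ loop, the runs of adjacent duplicates are the cases that get miscounted.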
    

    However, as others mentioned, your code is unnecessarily complicated and inefficient.

    You can use a std::unordered_map to count frequencies:

      std::unordered_map<std::string, unsigned> freq;
      for (const auto& word : words) {
           ++freq[word];
      }
    
      for (const auto& f : freq) {
           std::cout << f.first << " appears " << f.second << " times\n";
      }