C++ Counting words in a file between two words


I am trying to count the number of words in a file. After that, I plan to count the words between two given words in the file. For example, my file may contain "Hello my name is James". Counting all the words should give 5, and counting the words between "Hello" and "James" should give 3. I am having trouble with both tasks, mainly because I am not sure how to structure my code. My current code uses spaces to count the words. Any help would be greatly appreciated.

Here is my code:

readwords.cpp

string ReadWords::getNextWord()
{
    bool pWord = false;
    char c;
    while((c = wordfile.get()) !=EOF)
    {
        if (!(isspace(c)))
        {
            nextword.append(1, c);
        }

        return nextword;
    }
}

bool ReadWords::isNextWord()
{
    if(!wordfile.eof())
    {
        return true;
    }
    else
    {
        return false;
    }
}

main.cpp

main()
{
    int count = 0;
    ReadWords rw("hamlet.txt");
    while(rw.isNextWord()){
        rw.getNextWord();
        count++;
    }
    cout << count;
    rw.close();
}

At the moment it counts the number of characters instead of words. I'm sure it's a simple fix and I'm missing something silly, but I've been trying for long enough that I've come looking for help.

Any help is greatly appreciated. :)


Solution

  • Rather than parse the file character-by-character, you can simply use the stream extraction operator, operator>>(), to read whitespace-separated words. >> returns the stream, which evaluates to true as a bool as long as the read succeeded.

    vector<string> words;
    string word;
    while (wordfile >> word)
        words.push_back(word);
    

    There is a common formulation of this using the <iterator> and <algorithm> utilities, which is more verbose, but can be composed with other iterator algorithms:

    istream_iterator<string> input(wordfile), end;
    copy(input, end, back_inserter(words));
    

    Then you have the number of words and can do with them whatever you like:

    words.size()
    
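    As a rough, self-contained sketch of the two approaches above: readWords() and countWords() are hypothetical helper names (not from the question), and both take any std::istream so they can be tried on an in-memory string as well as on a file. countWords() assumes you only need the total and skips the vector entirely.

    ```cpp
    #include <cstddef>
    #include <istream>
    #include <iterator>
    #include <sstream>   // std::istringstream, used in the usage note
    #include <string>
    #include <vector>

    // Hypothetical helper: collects every whitespace-separated word from 'in'.
    std::vector<std::string> readWords(std::istream& in) {
        std::vector<std::string> words;
        std::string word;
        while (in >> word)            // false once no further word can be read
            words.push_back(word);
        return words;
    }

    // Hypothetical helper: counts the words without storing them.
    std::size_t countWords(std::istream& in) {
        std::istream_iterator<std::string> first(in), last;
        return static_cast<std::size_t>(std::distance(first, last));
    }
    ```

    For example, with std::istringstream in("Hello my name is James"), readWords(in).size() is 5. A std::ifstream opened on the file can be passed in the same way.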

    If you want to find "Hello" and "James", use find() from the <algorithm> header to get iterators to their positions:

    // Find "Hello" anywhere in 'words'.
    const auto hello = find(words.begin(), words.end(), "Hello");
    
    // Find "James" anywhere after 'hello' in 'words'.
    const auto james = find(hello, words.end(), "James");
    

    If they’re not in the vector, find() will return words.end(); ignoring error checking for the purpose of illustration, you can count the number of words between them by taking their difference, adjusting for the inclusion of "Hello" in the range:

    const auto count = james - (hello + 1);
    
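    The error checking skipped above could be filled in roughly like this. wordsBetween() is a hypothetical name, and returning -1 for "not found" is just one possible convention. Note that this sketch starts the second search one past the first word, so it also behaves sensibly when both words are the same.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical helper: number of words strictly between the first
    // occurrence of 'first' and the next occurrence of 'last' after it,
    // or -1 if either word cannot be found.
    std::ptrdiff_t wordsBetween(const std::vector<std::string>& words,
                                const std::string& first,
                                const std::string& last) {
        const auto begin = std::find(words.begin(), words.end(), first);
        if (begin == words.end())
            return -1;                    // 'first' is not in the vector
        const auto end = std::find(begin + 1, words.end(), last);
        if (end == words.end())
            return -1;                    // 'last' does not follow 'first'
        return end - (begin + 1);         // excludes both endpoints
    }
    ```

    With the words of "Hello my name is James", wordsBetween(words, "Hello", "James") gives 3.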

    You can use operator-() here because std::vector::iterator is a “random-access iterator”. More generally, you could use std::distance() from <iterator>:

    const auto count = distance(hello, james) - 1;
    

    This has the advantage of being more descriptive of what you’re actually doing. Also, for future reference, this kind of code:

    bool f() {
        if (x) {
            return true;
        } else {
            return false;
        }
    }
    

    Can be simplified to just:

    bool f() {
        return x;
    }
    

    Since x is already being converted to bool for the if.
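
    Applied to the class from the question, both ideas (reading with >> and returning the condition directly) might look something like the sketch below. This is only a guess at the class's shape, since the question doesn't show its declaration: taking a std::istream& instead of a filename is an assumption made here so the class can be tried without a file.

    ```cpp
    #include <istream>
    #include <sstream>   // std::istringstream, used in the usage note
    #include <string>

    // Hypothetical reworking of the question's ReadWords class.
    class ReadWords {
    public:
        explicit ReadWords(std::istream& in) : wordfile(in) {}

        // True if another word can be read: skip any leading whitespace,
        // then check that a character is actually available.
        bool isNextWord() {
            wordfile >> std::ws;
            return wordfile.peek() != std::char_traits<char>::eof();
        }

        // Reads one whitespace-separated word (empty if none is left).
        std::string getNextWord() {
            std::string word;
            wordfile >> word;
            return word;
        }

    private:
        std::istream& wordfile;
    };
    ```

    With a std::istringstream holding "Hello my name is James", the loop while (rw.isNextWord()) { rw.getNextWord(); ++count; } leaves count at 5. Checking eof() alone, as in the question, is not enough, because the end-of-file flag is only set after a read has already run into the end of the file.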