I'm trying to find the occurrences of a given word from an input file, and I'm able to correctly count the occurrences of a letter/character, but when I attempt to find a word the program just returns the count as 0. What am I doing wrong?
ifstream input("input.txt");
input.open("input.txt");
string video = "video", ands = "and";
string str1((istreambuf_iterator<char>(input)),
            istreambuf_iterator<char>());
int videocount = 0, sentcount = 0, wordcount = 0, wordcountand = 0, wordcountand2 = 0;
for (int i = 0; i < str1.length(); i++)
{
    if (str1 == video) {
        ++videocount;
    }
    if (str1[i] == '.') {
        sentcount++;
    }
    if (str1[i] == ' ') {
        wordcount++;
    }
    if (str1 == ands) {
        wordcountand++;
    }
}
Edit : I just changed the way the file was read and everything worked again.
while (input >> filewords) {
    wordcount++;
    if (filewords == word1) {
        ++videocount;
    }
    if (filewords == word2) {
        wordcountand++;
    }
    for (int i = 0; i < filewords.length(); i++) {
        if (filewords[i] == '.') {
            sentcount++;
        }
    }
}
Basically, the question has already been answered in the comment. You cannot compare a search string to the complete text file that is stored in your variable "str1". The result will of course always be false.
The equality operator == does not look for sub-strings. And this brings us to the answer, the algorithm that we want to use. We will use std::string::substr. Its parameters are the start position of the sub-string (pos) and its length in characters (count).
So, we need to find the start-position of a word and the end-position of a word. With that, we can count the length of a word which is "end-position" - "start-position".
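To make the difference concrete, here is a minimal sketch. The helper name wordAt is made up for this illustration and is not part of the code in the question; it only shows that a piece cut out with substr(pos, count) can be compared with ==, while the complete text cannot.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: true if the `length` characters of `text`
// that begin at index `start` are exactly equal to `word`.
// Note: substr throws std::out_of_range if `start` is larger than text.size().
bool wordAt(const std::string& text, std::size_t start, std::size_t length,
            const std::string& word) {
    // substr(pos, count) copies at most `count` characters beginning at `pos`
    return text.substr(start, length) == word;
}
```

For a text like "a video and more", the comparison text == "video" is false, but wordAt(text, 2, 5, "video") is true, because the five characters starting at index 2 are "video".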
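But how to identify a word? A word usually consists of alphanumeric characters. If we iterate through the complete text and compare the previously checked character with the current one, we can state the following:
If the current character is alphanumeric and the previous one is not, we have found the start of a word.
If the previous character is alphanumeric and the current one is not, we have found the end of a word.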
And then, something like word = str1.substr(startPosition, endPosition-startPosition); would give us a single word. This we can compare with our search words, like for example:
if (word == video) ++videocount;
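The start/end detection described above could be sketched like this. The function name extractWords and the variable insideWord are assumptions for this illustration; treating the position one past the end as a non-word character is my own choice so that a word at the very end of the text is not lost.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns all runs of alphanumeric characters in `text`.
std::vector<std::string> extractWords(const std::string& text) {
    std::vector<std::string> words;
    bool insideWord = false;          // was the previous character part of a word?
    std::size_t startPosition = 0;    // index where the current word began

    // Run one step past the end so that a trailing word gets closed as well
    for (std::size_t i = 0; i <= text.length(); ++i) {
        // Cast to unsigned char: std::isalnum is undefined for negative values
        const bool isWordChar = i < text.length() &&
            std::isalnum(static_cast<unsigned char>(text[i])) != 0;

        if (isWordChar && !insideWord)    // start of a word
            startPosition = i;
        if (!isWordChar && insideWord)    // end of a word
            words.push_back(text.substr(startPosition, i - startPosition));

        insideWord = isWordChar;
    }
    return words;
}
```

Each returned word can then be compared with a search word using ==, since it no longer carries surrounding spaces or punctuation.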
But we can go further. With a very simple standard method, we can store and count all words. For that we can use a std::map or a std::unordered_map. We use the index operator (operator[]) of the std::map. Note especially this sentence from its description:
Returns a reference to the value that is mapped to a key equivalent to key, performing an insertion if such key does not already exist.
So, it will either create a new entry, or, find an existing entry. In any case, a reference (to the either already existing or the newly created entry) will be returned. And that will be incremented. This can then end up in something like:
wordCounter[text.substr(startIndexOfWord, index - startIndexOfWord)]++
So, here, we first build a sub-string using the already described algorithm. This sub-string is then either found in or added to the std::map. In any case, a reference will be returned, which we will increment.
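Isolated from the text parsing, the counting idiom might look like the following minimal sketch. The function name countWords and the vector input are assumptions for this illustration only.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: counts how often each word occurs in `words`.
std::map<std::string, std::size_t> countWords(const std::vector<std::string>& words) {
    std::map<std::string, std::size_t> wordCounter;
    for (const std::string& word : words) {
        // operator[] inserts the key with a value-initialized counter (0)
        // if it is not present yet, and returns a reference in either case
        ++wordCounter[word];
    }
    return wordCounter;
}
```

No explicit "does the key exist?" check is needed, because a missing key starts at 0 before the increment.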
At the end, we will simply output all words and counters.
In the following proposal I am using C++17 and features of C++17 like the if-statement with initializer and structured bindings. So you need to enable C++17 for your compiler.
Please see:
#include <iostream>
#include <fstream>
#include <string>
#include <iterator>
#include <cctype>
#include <map>
#include <iomanip>

int main() {
    // Open the input file and check if that works
    if (std::ifstream ifs("input.txt"); ifs) {
        // Read the complete text file into a string variable
        std::string text(std::istreambuf_iterator<char>(ifs), {});
        // Define the counters
        size_t sentenceCounter{};
        std::map<std::string, size_t> wordCounter{};
        size_t overallWordCounter{};
        // Temporary storage of the previously checked character
        char lastCharacter{};
        // Here we store the index of a word start
        size_t startIndexOfWord{};
        // std::isalnum needs a value representable as unsigned char, so cast first
        auto isAlnum = [](char c) { return std::isalnum(static_cast<unsigned char>(c)) != 0; };
        // Iterate over all characters from the source file
        for (size_t index{}; index < text.length(); ++index) {
            // Read the current character
            const char currentCharacter = text[index];
            // Each dot will be counted as an indicator for a sentence
            if ('.' == currentCharacter) ++sentenceCounter;
            // Now check if we have found the start of a word. Then we will just store the index
            if (isAlnum(currentCharacter) and not isAlnum(lastCharacter))
                startIndexOfWord = index;
            // Now check if we found the end of a word. Add to map and increment counter
            if (isAlnum(lastCharacter) and not isAlnum(currentCharacter))
                wordCounter[text.substr(startIndexOfWord, index - startIndexOfWord)]++;
            // The next lastCharacter is the currentCharacter of now
            lastCharacter = currentCharacter;
        }
        // A word that runs to the very end of the text has no terminating character, so count it here
        if (isAlnum(lastCharacter))
            wordCounter[text.substr(startIndexOfWord)]++;
        // Go through the complete map
        for (const auto& [word, count] : wordCounter) {
            // Show words and counters
            std::cout << std::left << "Word: " << std::setw(30) << word << " Count: " << count << "\n";
            // Calculate overall sum of words
            overallWordCounter += count;
        }
        // Show final result
        std::cout << "\nWords overall: \t" << overallWordCounter << "\nSentences: \t" << sentenceCounter << '\n';
    }
    else {
        std::cerr << "\n***Error: Could not open input file.\n";
    }
    return 0;
}
Of course there are many other possible solutions, especially with std::regex.
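As one possible illustration of the std::regex route, here is a minimal sketch. The function name countWordRegex is made up, and it assumes that the search word contains no regex metacharacters (otherwise it would have to be escaped first).

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: counts whole-word occurrences of `word` in `text`.
// Assumption: `word` consists of plain alphanumeric characters only.
std::size_t countWordRegex(const std::string& text, const std::string& word) {
    // \b is a word boundary in the default ECMAScript grammar,
    // so "video" will not match inside "videos"
    const std::regex pattern("\\b" + word + "\\b");
    const auto first = std::sregex_iterator(text.begin(), text.end(), pattern);
    const auto last = std::sregex_iterator();
    // The number of matches is the distance between the two iterators
    return static_cast<std::size_t>(std::distance(first, last));
}
```

This is shorter than the manual loop, but std::regex is usually slower, so for big files the character-by-character approach may be the better choice.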
If you have any questions, I am happy to answer