I'm writing a word-count function that reads characters from stdin into a string, then evaluates the string and reports the number of words, the number of lines, the size of the string, and the number of unique words.
My issue is with adding words to the set of unique words. As written, the code treats the whitespace as part of the word and inserts the whole thing into my set. Example input:
this is
is
a test
test
Output:
a
test
is test this
line is 4
Words = 7
size is 27
Unique is 6
It counts 7 words in total and 6 unique. I tried debugging by printing intermediate values as I go so I can keep track of where it goes wrong, and I can only conclude that the issue lies within my if statements. How can I get past this? I've been stuck on it for some time now.
Here is my code:
#include<iostream>
#include<string>
#include<set>

using std::string;
using std::set;
using std::cin;
using std::cout;

set<string> UNIQUE;

size_t sfind(const string s) //will take string a count words, add to set
{
    string a;
    int linecount = 0;
    int state = 0; //0 represents reading whitespace/tab, 1 = reading letter
    int count = 0; //word count
    for(size_t i =0; i < s.length(); i++) {
        a+=s[i]; //add to new string to add to set
        if(state ==0) { //start at whitespace
            if(state != ' ' && state != '\t') { //we didnt read whitespace
                count++;
                state =1;
            }
        }
        else if(s[i]== ' ' || s[i] == '\t' || s[i] == '\n') {
            state = 0;
            UNIQUE.insert(a); //add to UNIQUE words
            a.clear(); // clear and reset the string
        }
        if (s[i] == '\n') {
            linecount++;
        }
    }
    for(set<string>::iterator i = UNIQUE.begin(); i!= UNIQUE.end(); i++) {
        cout << *i;
    }
    cout << '\n';
    cout << "line is " << linecount << '\n';
    return count;
}

int main()
{
    char c;
    string s;
    while(fread(&c,1,1,stdin)) {
        s+=c; //read element add to string
    }
    cout << "Words = " << sfind(s) << '\n';
    cout << "size is " << s.length() << '\n';
    cout << "Unique is "<< UNIQUE.size() << '\n';
    return 0;
}
Also, I need to keep reading the input with
fread(&c,1,1,stdin)
because I will be using it later on in a larger word-count function.
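Rather than writing your own code to split the string on spaces, use std::istringstream to do the parsing. The >> operator skips leading whitespace and stops at the next whitespace character, so the extracted words never contain spaces, tabs, or newlines.
Here is an example: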
#include <string>
#include <iostream>
#include <sstream>
#include <set>

int main()
{
    std::set<std::string> stringSet;
    std::string line;
    // Read the input one line at a time.
    while (std::getline(std::cin, line))
    {
        // Wrap the line in a stream so that >> can split it into words.
        std::istringstream oneline(line);
        std::string word;
        while (oneline >> word)
        {
            std::cout << word << "\n";
            stringSet.insert(word); // duplicates are ignored by the set
        }
    }
    std::cout << "\n\nThere are " << stringSet.size() << " unique words";
}