c++ fstream simplification

How can I clean this program efficiently and still detect when the file input is not a letter?


Here is my whole program. I am supposed to calculate the average number of letters per word in an input file called hw4pr11input.txt. I have only been programming for a couple of weeks, so I would appreciate simple answers that I can implement with my limited knowledge. I do not know what arrays are yet; the chapter this homework comes from is on file I/O.

#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;

//function declaration 
void average_letters(ifstream& fin);
//Precondition: there is an input file with text ready to be read
//Postcondition: text from the input file is read, then the average length of
//words is calculated and output to the screen

//start main program
int main()
{
ifstream fin;

         fin.open("hw4pr11input.txt");                                               //opening input file
         if (fin.fail())                                                            //checking for input file opening failure
         {
            cout << "Input file open fail";
            exit(1);                                                               //terminating program if check fails
         }
         cout << "File Open\n";
         average_letters(fin);                                                     //calling function to compute the average word length
         system("pause");
         return 0;
}
                                                                                   //function definition, uses iostream and fstream
void average_letters(ifstream& fin)
{
char next, last_char = 0;
double letter_count = 0, word_count = 0;
double average = 0;
     while(fin.get(next))                                                          //loop ends when no more characters can be read
     {
         if(!(next == ' ' || next == ',' || next == '.' || next == '/'             
         || next =='(' || next == ')')) 
         {
                   letter_count++;                                                                    
         }

         else
         {   
             if(last_char == ' ' || last_char == ',' || last_char == '.' || last_char == '/'
             || last_char == '(' || last_char == ')')                              //previous character was also a separator

             {
                     continue;
             }
             else
             {
                     word_count++;
             }
         }
         last_char = next;                  //stores previous value of loop for comparison
     }
     average = letter_count/word_count;
     cout << "The average length of the words in the file is:" << " " <<average;
     cout << endl;
}

I believe this program accomplishes the assignment, but my main concern is the part of average_letters that checks whether a character is a letter or a symbol. I chose the symbol list by looking at the .txt file. I deleted my comments because they make copying and pasting here difficult; I apologize if that makes my logic harder to follow.

Thanks for your help. Go easy on me :).


Solution

  • You can use a std::bitset<256>, indexed by the character converted to an unsigned char, and set to true only those positions that correspond to word characters. In your loop, you then just look up whether the current character is a word character.

    Note that this presupposes an 8-bit char with 256 possible values rather than a wider Unicode character type. For wider characters you can size the bitset up accordingly.

    This gives you a very fast check of whether a character is a word character, and it lets you define exactly which characters to include (if the requirement suddenly changes to include '-', for example).