
When using fstream in C++, how can I filter out control and formatting characters?


I have one of those basic assignments where you count how many characters of each type appear in an input file. There are three files involved: the main program (included below), the hopper.txt input text to be analyzed, and the sample output.txt, which shows what the command-line output should look like.

I believe I have everything set up, but my final numbers aren't turning out correctly. Specifically, my "other" and "total" counters are about 200 too high. I've done some counting with other programs and am fairly sure the sample output is correct, which is why I suspect I must be counting hidden characters (and they must be there, because the output isn't just one block of text).

I've tried casting each character to an int to see what its ASCII value is and filtering on that range, but my IDE (Xcode) warns that "comparison of constant with expression of type 'bool' is always true", and the check doesn't seem to catch anything.

Here are the other two files:

hopper.txt

sample output.txt

/***************************************************************
 CSCI 240         Program 4     Summer 2013

 Programmer: 

 Date Due: 7/14/14

 Purpose: This program reads in the characters from a text file.
    While reading them it collects relevant data about
    the frequency of different ASCII characters and shares its
    results.
 ***************************************************************/

#include <iostream>
#include <iomanip>
#include <fstream>
#include <unistd.h>

#define FILENAME "/Users/username/Documents/NIU/240/Assigntment\ 4/hopper.txt"

using namespace std;

bool isVowel(char ch);
bool isConsonant(char ch);

int main()
{
    ifstream inFile;
    inFile.open (FILENAME, ios::in);

    char ch;
    int t_total         = 0;

    int t_vowel         = 0;
    int t_consonant     = 0;
    int t_letter        = 0;

    int t_leftParen     = 0;
    int t_rightParen    = 0;

    int t_singleQuote   = 0;
    int t_doubleQuote   = 0;

    int t_digit         = 0;
    int t_other         = 0;


    //See if we successfully imported the file
    if (inFile.fail())
    {
        cout<< "\nThe file entitled: " << FILENAME << " failed to open.\n";
        return 0;
    }


    do
    {
        //get next letter and print it out
        inFile.get (ch);
        cout << ch;
        //increment total
        t_total++;

        //check if the character is a letter and if so if it is a vowel or consonant
        if(isalpha(ch)){
            t_letter++;
            //we have found a letter

            if(isVowel(ch)) {
                t_vowel++;
                //we have found a vowel
            }
            else if(isConsonant(ch)) {
                t_consonant++;
                //we have found a consonant;
            }
            else {
                cout << "\nYou shouldnt be here...";

            }
        }

        //check if the character is a digit
        else if (isdigit(ch)) {
            t_digit++;
            //we have found a digit
        }
        //filter out formatting characters
        else if (!( 32 <= ((int)ch) <= 255)) {
            continue;
        }

        //covers all other cases of ASCII characters
        else {
            switch(ch) {
                case '(':
                    t_leftParen++;
                    break;
                case ')':
                    t_rightParen++;
                    break;
                case '\'':
                    t_singleQuote++;
                    break;
                case '\"':
                    t_doubleQuote++;
                    break;
                default:
                    t_other++;
                    break;

            }
        }
    } while (inFile);


    //These are really just here for the convenience of not changing each value while working on formatting
    int width1 = 25;
    int width2 = 6;

    //print out the totals found in the document
    cout << "\n\nSummary\n";
    cout << fixed << setw(width1) << "\nTotal characters:" << setw(width2) << right << t_total;
    cout << fixed << setw(width1) << "\nVowels:" << setw(width2) << right << t_vowel;
    cout << fixed << setw(width1) << "\nConsonants:" << setw(width2) << right << t_consonant;
    cout << fixed << setw(width1) << "\nLetters:" << setw(width2) << right << t_letter;
    cout << fixed << setw(width1) << "\nDigits:" << setw(width2) << right << t_digit;
    cout << fixed << setw(width1) << "\nLeft parentheses:" << setw(width2) << right << t_leftParen;
    cout << fixed << setw(width1) << "\nRight parentheses:" << setw(width2) << right << t_rightParen;
    cout << fixed << setw(width1) << "\nSingle quotes:" << setw(width2) << right << t_singleQuote;
    cout << fixed << setw(width1) << "\nDouble quotes:" << setw(width2) << right << t_doubleQuote;
    cout << fixed << setw(width1) << "\nOther:" << setw(width2) << right << t_other;



    return 0;
}


/***************************************************************
 Function: isVowel

 Use: Checks if the input character is a vowel.

 Arguments: 1. ch: An ASCII character

 Returns:   true if it is a vowel, false if it is not
 ***************************************************************/
bool isVowel(char ch) {
    //double check we have a letter
    if(isalpha(ch)) {
        //reduce to lower case to reduce the number of cases that must be checked
        ch = tolower(ch);
        if(ch == 'a' || ch == 'e' || ch == 'i' || ch == 'o' || ch == 'u') {
            return true;
        }
        else {
            return false;
        }
    }
    return false;
}

/***************************************************************
 Function: isConsonant

 Use: Checks if the input character is a consonant.

 Arguments: 1. ch: An ASCII character

 Returns:   true if it is a consonant, false if it is not
 ***************************************************************/
bool isConsonant(char ch) {
    //So long as it is a letter, anything that is not a vowel must be a consonant
    if(isalpha(ch)) {
        return !isVowel(ch);
    }
    return false;
}

Solution

  • You can use std::isspace to test whether a character is one of:

    • space (0x20, ' ')
    • form feed (0x0c, '\f')
    • line feed (0x0a, '\n')
    • carriage return (0x0d, '\r')
    • horizontal tab (0x09, '\t')
    • vertical tab (0x0b, '\v')

    And ignore those by adding a test in your reading loop:

        else if (std::isspace(ch)) {
             continue; // Do not update counters
        }