
Word Count in C


I have a word count program in C that excludes special characters and number digits:

#include <stdio.h>
#include <string.h>

int main(){
    char str[100];
    int i, j = 0;
    int len;
    char choice;
    
    do {
        printf("Enter String: ");
        fgets(str, sizeof(str), stdin);

        len = strlen(str);

        for (i = 0; i < len; i++) {
            if((str[i] >= 'A' && str[i] <= 'Z') ||
              (str[i] >= 'a' && str[i] <= 'z')) { //checks if it's upper or lowercase
                if (str[i] != ' ' && (str[i + 1] == ' ' || str[i + 1] == '\0')) { //checks that the current character is not a space and the next one is a space or the end of the string
                    j++;
                }
            }
        }

        printf("Word count: %d\n", j);

        printf ("Try again?(Y/N): ");
        scanf (" %c", &choice);    
     
        getchar ();//use to clear input buffer
    } while (choice == 'y' || choice == 'Y');

    return 0;  
}

But it doesn't count the first word after a whitespace, and the count gets messier each time the loop repeats.

I tried changing the values of the array, the loop, and also tried to ask AI what I've missed, but the suggestions didn't really work.

output:

Enter String: i
Word count: 0
Try again? (Y/N): Y
Enter String: i have 3 Apples
Word count: 2
Try again? (Y/N): Y
Enter String: hello WORLD
Word count: 3
Try again? (Y/N): hello world

Solution

  • For starters, the function fgets stores the newline character '\n' in the entered string whenever it fits in the buffer.

    You should either remove it, for example like this:

    str[strcspn( str, "\n" )] = '\0';
    

    or take it into account in your if statements.

    Otherwise you get, for example,

    Enter String: i
    Word count: 0
    

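    though according to your description the word count should be equal to 1. The letter 'i' is followed by '\n', not by ' ' or '\0', so the inner if statement never fires.

    Secondly, you are not resetting the variable j to zero within the do-while loop, so the count from earlier inputs carries over. That is why "hello WORLD" reports 3: it adds 1 to the 2 left over from the previous input.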

    This expression

    str[i] != ' '
    

    in the inner if statement always evaluates to true, because the outer if statement has already established that str[i] is a letter. So its presence in the if statement is redundant.

    Also, it is much better to use the standard function isalpha (declared in <ctype.h>) instead of comparing characters against letter ranges by hand.

    Finally, the tab character '\t' and other non-alphabetic characters can also separate words, as your description of the task implies, but your code ignores that. With your approach, the string "one,two,three" contains only one word.
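
Putting the points above together, here is a minimal sketch of one way to do the counting. The function name count_words is hypothetical, and it assumes that a "word" is a maximal run of letters, so any non-letter (space, tab, newline, digit, punctuation) ends a word. Under that assumption "abc123def" counts as two words, which may or may not be what you want.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts maximal runs of alphabetic characters in s.
   Assumption: any non-letter character separates words. */
size_t count_words(const char *s)
{
    size_t count = 0;  /* local, so it starts at zero on every call */
    int in_word = 0;   /* nonzero while we are inside a run of letters */

    for (; *s != '\0'; ++s) {
        /* The cast avoids undefined behavior for negative char values. */
        if (isalpha((unsigned char)*s)) {
            if (!in_word) {
                /* First letter of a new word. */
                in_word = 1;
                ++count;
            }
        } else {
            /* Any non-letter, including '\n' left by fgets, ends the word. */
            in_word = 0;
        }
    }
    return count;
}
```

In your do-while loop, the for loop and the variable j could then be replaced by something like printf("Word count: %zu\n", count_words(str));. Because the newline is treated as just another separator, the string does not need to be trimmed first for the count to come out right.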