I try to count the number of characters, words, lines in a file. The txt file is:
The snail moves like a
Hovercraft, held up by a
Rubber cushion of itself,
Sharing its secret
And here is the code,
void count_elements(FILE* fileptr, char* filename, struct fileProps* properties) // counts chars, words and lines
{
fileptr = fopen(filename, "rb");
int chars = 0, words = 0, lines = 0;
char ch;
while ((ch = fgetc(fileptr)) != EOF )
{
if(ch != ' ') chars++;
if (ch == '\n') // check lines
lines++;
if (ch == ' ' || ch == '\t' || ch == '\n' || ch == '\0') // check words
words++;
}
fclose(fileptr);
properties->char_count = chars;
properties->line_count = lines;
properties->word_count = words;
}
But when i print the num of chars, words and lines, outputs are 81, 18, 5 respectively What am i missing? (read mode does not changes anything, i tried "r" as well)
The solution I whipped up gives me the same results as the gedit document statistics:
#include <stdio.h>
void count_elements(char* filename)
{
// This can be a local variable as its not used externally. You do not have to put it into the functions signature.
FILE *fileptr = fopen(filename, "rb");
int chars = 0, words = 0, lines = 0;
int read;
unsigned char last_char = ' '; // Save the last char to see if really a new word was there or multiple spaces
while ((read = fgetc(fileptr)) != EOF) // Read is an int as fgetc returns an int, which is a unsigned char that got casted to int by the function (see manpage for fgetc)
{
unsigned char ch = (char)read; // This cast is safe, as it was already checked for EOF, so its an unsigned char.
if (ch >= 33 && ch <= 126) // only do printable chars without spaces
{
++chars;
}
else if (ch == '\n' || ch == '\t' || ch == '\0' || ch == ' ')
{
// Only if the last character was printable we count it as new word
if (last_char >= 33 && last_char <= 126)
{
++words;
}
if (ch == '\n')
{
++lines;
}
}
last_char = ch;
}
fclose(fileptr);
printf("Chars: %d\n", chars);
printf("Lines: %d\n", lines);
printf("Words: %d\n", words);
}
int main()
{
count_elements("test");
}
Please see the comments in the code for remarks and explanations. The code also would filter out any other special control sequences, like windows CRLF and account only the LF