
C: count lines in a file, avoiding a premature EOF


Numerous sources offer a simple C program to count the number of lines in a file. I'm using one of these.

#include <stdio.h>


int main(int argc, char* argv[]) {   
    FILE *file;
    long count_lines = 0;
    char chr;
 
    file = fopen(argv[1], "r");
    while ((chr = fgetc(file)) != EOF)
    {
        count_lines += chr == '\n';
    }
    fclose(file); //close file.
    printf("%ld %s\n", count_lines, argv[1]);
    return 0;
}

However, it fails to count the number of lines in Top2Billion-probable-v2.txt. It stops on the line

<F0><EE><E7><E0><EB><E8><FF>

and outputs

1367044 Top2Billion-probable-v2.txt

when it should output 1973218846 lines. wc -l somehow avoids the problem (and is amazingly faster).

Should I give up on a correct C implementation of counting the number of lines in a file, or how should I handle the special characters as wc does?


Solution

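  • fgetc() returns the character read as an unsigned char converted to an int, or EOF on end of file or error. When that result is stored in a char, the byte 0xFF (the <FF> that ends the line where the program stops) typically becomes -1 on platforms where char is signed. EOF is usually -1 as well, so the comparison succeeds and the loop ends early. Hence declaring chr as int instead of char should solve the issue.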
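
A minimal sketch of the fix, keeping the loop from the question and changing only the type of chr. The helper name count_newlines is hypothetical (it is not in the original post), and the sketch assumes the caller has already opened the stream and checked that fopen() did not return NULL.

```c
#include <stdio.h>

/* Counts '\n' bytes in an already opened stream.
   count_newlines is a hypothetical helper name used for illustration. */
long count_newlines(FILE *file)
{
    long count_lines = 0;
    int chr; /* int, not char: keeps byte 0xFF (255) distinct from EOF */

    while ((chr = fgetc(file)) != EOF)
    {
        count_lines += (chr == '\n');
    }
    return count_lines;
}
```

In the original main(), this would be called as count_newlines(file) between fopen() and fclose(). With chr declared as int, a 0xFF byte is returned as 255 and no longer compares equal to EOF, so the loop runs to the real end of the file.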