
Reading files with the same extension in a directory and counting their lines


I am having a problem with my code. I am trying to open all files in a directory that have the same extension and count the number of lines in each of them. Here is what I've done:

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <math.h>
#include <string.h>
#include <ctype.h>
int countLines(char name[]);
int main()
{
    struct dirent *de;
    DIR *dr=opendir(".");
    char check[16]=".nkt";
    int i;
    char name[64];
    int count=0;

    if(dr==NULL)
    {
        printf("Didn't open!");
        return 0;
    }

    while((de=readdir(dr))!=NULL)
    {
        if((strstr(de->d_name, check))!=NULL)
        {
            strcpy(name, de->d_name);
            countLines(name);
        }
    }

    closedir(dr);

    return 0;
}

int countLines(char name[])
{
    FILE *fp;
    fp=fopen(name,"r");
    char ch;
    int lines=0;
    while(!feof(fp))
    {
        ch=fgetc(fp);
        if(ch=='\n')
        {
            lines++;
        }
    }

    fclose(fp);

    printf("%d\n", lines);
}

The result I get is always like this:

2
2
2

This happens even though every file has 54 lines. I would appreciate some help. PS: The extension is .nkt


Solution

  • The countLines() function you show is stepping into several traps.

    1. fgetc() returns int, not char, on purpose. This lets it report the End-of-File state (EOF) in addition to every possible character value. A plain char cannot do this.

    2. Using feof() to detect the End-of-File fails because the EOF indicator is set only after a read has actually hit the end of the file. So a loop steered by feof() typically iterates one time too often.

      A detailed discussion on this is here.

    3. A text file's last line does not necessarily end with an End-of-Line indicator, but you most likely still want to count that line. Special logic needs to be applied to cover this case.
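
The extra iteration described under 2. can be made visible with a small sketch. The two helper names below are made up for this illustration; both functions simply count how often the loop body runs on an already opened stream.

```c
#include <stdio.h>

/* Counts loop iterations when the loop is steered by feof(),
   as in the question's code. (Hypothetical helper for illustration.) */
long iterations_with_feof(FILE *fp)
{
  long n = 0;

  while (!feof(fp))
  {
    fgetc(fp); /* The read that hits the end still runs the loop body. */
    ++n;
  }

  return n;
}

/* Counts loop iterations when the loop is steered by the value
   fgetc() returns. (Hypothetical helper for illustration.) */
long iterations_with_eof_check(FILE *fp)
{
  long n = 0;

  while (EOF != fgetc(fp))
  {
    ++n;
  }

  return n;
}
```

For a stream holding the three characters "ab\n", the first function should report 4 iterations and the second one 3, because the feof() loop also processes the failed read that returned EOF.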

    A possible implementation of a function taking care of all the issues mentioned above might look like this:

    #include <stdio.h>
    
    /* Returns the number of lines inside the file named file_name 
       or -1 on error. */
    long count_lines(const char * file_name)
    {
      long lines = 0;
      FILE * fp = fopen(file_name, "r"); /* Open file to read in text mode. */
      if (NULL == fp)
      {
        lines = -1;
      }
      else
      {
        int previous = EOF;
    
        for (int current; (EOF != (current = fgetc(fp)));)
        {
          if ('\n' == current)
          {
            ++lines;
          }
    
          previous = current;
        }
    
        if (ferror(fp)) /* fgetc() returns EOF as well if an error occurred.
                           This call identifies that case. */
        {
          lines = -1;
        }
        else if (EOF != previous && '\n' != previous)
        {
          ++lines; /* Last line missed trailing new-line! */
        }
    
        fclose(fp);
      }
    
      return lines;
    }
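
To tie this back to the directory loop in the question, here is a minimal sketch of a driver. It is not part of the original answer: the names has_extension() and report_line_counts() are hypothetical, and the counting function is passed in as a pointer so the sketch compiles on its own. It also assumes a POSIX system providing <dirent.h>. Note that it compares the end of the file name instead of using strstr(), which would also match a name like "data.nkt.bak".

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if name ends with ext (for example ".nkt") and has at least
   one character in front of it, 0 otherwise. (Hypothetical helper.) */
int has_extension(const char *name, const char *ext)
{
  size_t name_len = strlen(name);
  size_t ext_len = strlen(ext);

  return name_len > ext_len
    && 0 == strcmp(name + name_len - ext_len, ext);
}

/* Calls count() for every entry of the current directory whose name ends
   with ext and prints the result per file.
   Returns the number of matching entries, or -1 if the directory
   could not be opened. (Hypothetical helper.) */
int report_line_counts(const char *ext, long (*count)(const char *))
{
  int matches = 0;
  DIR *dr = opendir(".");
  if (NULL == dr)
  {
    return -1;
  }

  for (struct dirent *de; NULL != (de = readdir(dr));)
  {
    if (has_extension(de->d_name, ext))
    {
      printf("%s: %ld\n", de->d_name, count(de->d_name));
      ++matches;
    }
  }

  closedir(dr);

  return matches;
}
```

With the count_lines() function shown above, a call could read report_line_counts(".nkt", count_lines).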
    

    Regarding the discussion about different End-of-Line indicators inside the question's comment section:

    The End-of-Line indicator for text files is implemented differently on different platforms (UNIX: '\n' vs. Windows: '\r\n' vs. ... (https://en.wikipedia.org/wiki/Newline)).

    To work around this, the C library function fopen() by default opens a file in so-called "text mode". If a file is opened this way, the C implementation takes care that each line's end is returned as a single '\n' character, the so-called "new-line" character. Please note (as mentioned above under 3.) that for the last line there might be no End-of-Line indicator at all.
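
If a count looks wrong, it can help to check which End-of-Line convention a file really uses. The following is only a diagnostic sketch and not part of the original answer: the struct eol_counts and the function count_line_endings() are hypothetical names. The file is opened in binary mode ("rb") on purpose, so that no translation happens and the raw '\r' and '\n' bytes can be tallied.

```c
#include <stdio.h>

/* Tally of the line endings found in a file. (Hypothetical type.) */
struct eol_counts
{
  long lf;   /* '\n' not preceded by '\r' */
  long crlf; /* '\r' directly followed by '\n' */
  long cr;   /* '\r' not followed by '\n' */
};

/* Tallies the line endings of the file named file_name into *counts.
   Returns 0 on success or -1 on error. (Hypothetical helper.) */
int count_line_endings(const char *file_name, struct eol_counts *counts)
{
  FILE *fp = fopen(file_name, "rb"); /* Binary mode: no translation. */
  if (NULL == fp)
  {
    return -1;
  }

  counts->lf = counts->crlf = counts->cr = 0;

  int previous = EOF;

  for (int current; EOF != (current = fgetc(fp));)
  {
    if ('\n' == current)
    {
      if ('\r' == previous)
      {
        ++counts->crlf;
      }
      else
      {
        ++counts->lf;
      }
    }
    else if ('\r' == previous)
    {
      ++counts->cr; /* The '\r' before this character stood alone. */
    }

    previous = current;
  }

  if ('\r' == previous)
  {
    ++counts->cr; /* A lone '\r' as the very last byte. */
  }

  int result = ferror(fp) ? -1 : 0;

  fclose(fp);

  return result;
}
```

A file whose lines end in a lone '\r' would show up here with a high cr count and few or no '\n' characters, which a loop counting only '\n' would under-report.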