
Why does this code print null characters and why does it find null characters in both files?


bool isIdentical(FILE *file1, FILE *file2) // this function tells me whether the files are identical
{
    // this should hold the first char of file1?
    char c1 = fgetc(file1);

    // this should hold the first char of file2?
    char c2 = fgetc(file2);

    while (c1 != EOF && c2 != EOF)  // check whether either file has reached its end
    {
        c1 = fgetc(file1);
        c2 = fgetc(file2);
        if (c1 != c2)
            return 0;

        // I expect this to print each char of the two files, but it actually prints NULL
        printf("%c %c\n", c1, c2);
    }

    // check whether both files reached the end without returning 0
    if (c1 == EOF && c2 == EOF)
        return 1;
    else
        return 0; // the files don't have the same length, hence are not identical
}

Solution

  • There are multiple problems:

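    • You must define c1 and c2 with type int instead of char to reliably test for EOF. fgetc() returns either a byte value converted from unsigned char (0 to 255 on typical systems) or the negative value EOF, which makes 257 different return values; a char can only hold 256 different values.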

    • Furthermore, the printf statement is never called for the first byte of either file, because the loop reads the next byte before printing, and on the last iteration it may be called with both c1 and c2 equal to EOF, which is not 0 but a negative value, usually -1. You might get funny characters on your terminal for these bytes, which you seem to interpret as null characters, but they are either non-ASCII characters (e.g. ÿ) or encoding errors, depending on the terminal settings.

    • The printf call outputs the raw byte values, which may cause spurious behavior in the terminal if the files have binary contents.

    • Also, the files must be opened in binary mode (with "rb") to ensure fgetc() does not translate the file contents in a system-specific way on legacy systems.
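    To illustrate the first point, here is a minimal sketch that counts the bytes of a stream twice, once with a signed char variable and once with an int. The function names countWithChar and countWithInt are hypothetical, and the sketch assumes that EOF is -1 and that converting 255 to signed char yields -1, which holds on common platforms but is not guaranteed by the C standard:

```c
#include <stdio.h>

// Count bytes the way the original code reads them: the result of
// fgetc() is squeezed into a signed char before the EOF test.
// Assumption: (signed char)255 == -1 == EOF on this platform.
long countWithChar(FILE *fp)
{
    long n = 0;
    signed char c;

    while ((c = (signed char)fgetc(fp)) != EOF)
        n++;    // stops early at the first 0xFF byte
    return n;
}

// Count bytes correctly: int can hold all byte values plus EOF.
long countWithInt(FILE *fp)
{
    long n = 0;
    int c;

    while ((c = fgetc(fp)) != EOF)
        n++;    // stops only at end of file or on a read error
    return n;
}
```

    Under those assumptions, a stream holding the three bytes 'A', 0xFF, 'B' is counted as 1 byte by countWithChar and as 3 bytes by countWithInt. If plain char is unsigned on your platform, the char version never sees EOF at all and loops forever instead.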

    Here is a modified version:

    #include <stdbool.h>
    #include <stdio.h>
    
    bool isIdentical(FILE *file1, FILE *file2)  // compare file contents
    {
        int c1 = fgetc(file1);
        int c2 = fgetc(file2);
    
        // loop until one or both files reach end of file
        while (c1 != EOF && c2 != EOF)
        {
            if (c1 != c2)
                return 0;
    
            printf("%d %d\n", c1, c2); // output the byte values
    
            c1 = fgetc(file1);
            c2 = fgetc(file2);
        }
    
        // return true if both files reached end of file at the same time
        return (c1 == EOF && c2 == EOF);
    }
    

    Here is a simpler alternative with fewer tests and without duplicate function calls:

    #include <stdbool.h>
    #include <stdio.h>
    
    bool isIdentical(FILE *file1, FILE *file2)  // compare file contents
    {
        for (;;)   // unconditional loop
        {
            int c1 = fgetc(file1);
            int c2 = fgetc(file2);
    
            if (c1 != c2)
                return 0;   // different file contents or lengths
    
            if (c1 == EOF)
                return 1;   // both files reached end of file
    
            printf("%d %d\n", c1, c2); // output identical byte values
        }
    }
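    For completeness, here is a sketch of how the comparison might be called with both files opened in binary mode, as recommended above. The wrapper compareFilesByName and its return convention (1 identical, 0 different, -1 on error) are hypothetical and not part of the original code; the printf call is left out to keep the example short:

```c
#include <stdbool.h>
#include <stdio.h>

// Same logic as the simpler alternative, without the debug output.
bool isIdentical(FILE *file1, FILE *file2)
{
    for (;;)
    {
        int c1 = fgetc(file1);
        int c2 = fgetc(file2);

        if (c1 != c2)
            return false;   // different file contents or lengths

        if (c1 == EOF)
            return true;    // both files reached end of file
    }
}

// Hypothetical wrapper: opens both files with "rb" and compares them.
// Returns 1 if identical, 0 if different, -1 if a file cannot be
// opened or a read error occurs.
int compareFilesByName(const char *path1, const char *path2)
{
    FILE *f1 = fopen(path1, "rb");
    if (f1 == NULL)
        return -1;

    FILE *f2 = fopen(path2, "rb");
    if (f2 == NULL)
    {
        fclose(f1);
        return -1;
    }

    int result = isIdentical(f1, f2) ? 1 : 0;

    // fgetc() also returns EOF on a read error, so check for that
    if (ferror(f1) || ferror(f2))
        result = -1;

    fclose(f1);
    fclose(f2);
    return result;
}
```

    Checking ferror() after the comparison distinguishes a genuine end of file from a read failure, since fgetc() reports both with the same EOF return value.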