Tags: c, arrays, struct, file-io, fgets

Reading text file data into an array of structs (while ignoring comment lines)


Overview:

The aim of the following program is to read data line by line from an input file into an array of structs, whilst simultaneously ignoring any comment lines within the input file which begin with the character '#'. The program should then iterate through the array of structs and print the contents, so as to confirm that the program is working as expected.

Here is an example of the input file, where 3 non-comment lines of data are seen. The number of non-comment lines is known before compilation, as seen in the attempt further below by the line int Nbodies = 3.

30 07 6991
# some comment
28 02 4991
09 09 2991

Note: the following SO questions, among others, were studied before posting this question:

Reading a text file and ignoring commented lines in C

Ignoring comments when reading in a file

Read a text file ignoring comments

Dilemma:

The program can successfully read lines into an array of structs and print the contents when there are no comment lines. The program can also successfully detect when a line begins with a '#' character, thus deeming it a comment line. The issue lies in that even when a comment line is detected, the program still attempts to incorrectly read this line into the array of structs.

This is the expected output:

30 07 6991
28 02 4991
09 09 2991

Here is the actual (and incorrect) output, which seems to ignore the final line of non-commented data:

30 07 6991
-842150451 -842150451 -842150451
28 02 4991

Current Attempt:

fgets is used to read each line and check whether it begins with a '#'. This comment check is performed within an IF statement that increments the Nbodies variable used in the FOR loop condition (so that an iteration isn't 'wasted' on a comment line). After this, sscanf is used to read the three values of the current non-comment line into the array of structs. fscanf was also tried, without success. Since continue; is used within the IF statement, shouldn't sscanf be skipped whenever a comment line is detected? It doesn't seem to behave as expected.

Code so far:

#include <stdio.h>
#include <stdlib.h> // malloc, EXIT_SUCCESS, EXIT_FAILURE

int main() {

    typedef struct {
        int a1, b1, c1;
    }DATA;

    FILE *file = fopen("delete.nbody", "r");
    if (file == NULL)
    {
        fprintf(stderr, "ERROR: file not opened.\n");
        return EXIT_FAILURE;
    }

    int Nbodies = 3;
    int comment_count = 0;
    DATA* data = malloc(Nbodies * sizeof * data); // Dynamic allocation for array
    char line[128]; // Length won't be longer than 128
    int x;
    for (x = 0; x < Nbodies; x++)
    {
        fgets(line, sizeof(line), file);
        if (line[0] == '#')
        {
            comment_count++;
            Nbodies++;// Advance Nbodies so that iteration isn't 'wasted' on a comment line
            continue;
        }

        // QUESTION: doesn't "continue;" within above IF mean that the 
        // following sscanf shouldn't scan the comment line?
        sscanf(line, "%d %d %d", &data[x].a1, &data[x].b1, &data[x].c1);
    }

    // Nbodies - comment_count, because Nbodies advanced
    // every time a comment was detected in the above FOR loop
    for (x = 0; x < Nbodies - comment_count; x++)
    {
        printf("%d %d %d\n", data[x].a1, data[x].b1, data[x].c1);
    }

    return (EXIT_SUCCESS);
}

Question:

Can anyone see why this program isn't working? I would have thought that the continue statement would stop sscanf from reading the comment lines once they are detected. Any help would be greatly appreciated.


Solution

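  • If you get a comment, do not increment Nbodies.

    That is the wrong approach because the pre-loop malloc sizes data for the original value of Nbodies, so any index at or beyond that value writes past the end of the allocation.

    It also introduces a gap in the data array: continue does skip the sscanf, but the loop's x++ still runs, so the slot for the comment line is left uninitialized and every later record lands one index higher.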

    Here is a better way, so that data only contains valid data. Comment lines become completely "invisible" (i.e. they do not affect the count of the data):

    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
    
        typedef struct {
            int a1,
             b1,
             c1;
        } DATA;
    
        FILE *file = fopen("delete.nbody", "r");
    
        if (file == NULL) {
            fprintf(stderr, "ERROR: file not opened.\n");
            return EXIT_FAILURE;
        }
    
        int Nbodies = 3;
        int comment_count = 0;
    
        // Dynamic allocation for array
        DATA *data = malloc(Nbodies * sizeof *data);
    
        char line[128]; // Length won't be longer than 128
        int x;
    
    #if 0
        for (x = 0; x < Nbodies; x++) {
            fgets(line, sizeof(line), file);
    
            if (line[0] == '#') {
                comment_count++;
                // Advance Nbodies so that iteration isn't 'wasted' on a comment
                Nbodies++;
                continue;
            }
    
            // QUESTION: doesn't "continue;" within above IF mean that the
            // following sscanf shouldn't scan the comment line?
            sscanf(line, "%d %d %d", &data[x].a1, &data[x].b1, &data[x].c1);
        }
    #else
        int inc = 1;
        for (x = 0; x < Nbodies; x += inc) {
            // Stop at end of file or on a read error
            if (fgets(line, sizeof(line), file) == NULL)
                break;

            // inc is 0 for a comment line, so x does not advance
            inc = (line[0] != '#');

            if (! inc) {
                comment_count++;
                continue;
            }

            // Only reached for non-comment lines; x is the next free slot
            sscanf(line, "%d %d %d", &data[x].a1, &data[x].b1, &data[x].c1);
        }
        Nbodies = x; // Number of records actually read
    #endif
    
        // Nbodies now counts only data records, so there is no need to
        // subtract comment_count as the original loop did
    #if 0
        for (x = 0; x < Nbodies - comment_count; x++) {
    #else
        for (x = 0; x < Nbodies; x++) {
    #endif
            printf("%d %d %d\n", data[x].a1, data[x].b1, data[x].c1);
        }
    
        free(data);
        fclose(file);

        return (EXIT_SUCCESS);
    }
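
As an alternative to the inc trick above, the same idea can be sketched with a while loop that keeps a separate record counter and advances it only after a line has actually been parsed. This is a minimal sketch, not part of the original answer: the helper names parse_record and read_records are hypothetical, and it assumes each data line holds exactly three integers and fits within 128 characters, as in the question.

```c
#include <stdio.h>

typedef struct {
    int a1, b1, c1;
} DATA;

/* Hypothetical helper: returns 1 and fills *out if `line` holds three
 * integers; returns 0 for comment lines (starting with '#') and for
 * blank or malformed lines. */
int parse_record(const char *line, DATA *out)
{
    if (line[0] == '#')
        return 0;
    return sscanf(line, "%d %d %d", &out->a1, &out->b1, &out->c1) == 3;
}

/* Hypothetical helper: reads at most `max` records from `fp` into `data`
 * and returns how many were stored. Skipped lines never consume a slot. */
size_t read_records(FILE *fp, DATA *data, size_t max)
{
    char line[128];
    size_t count = 0;

    /* Stop when the array is full, at end of file, or on a read error */
    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        if (parse_record(line, &data[count]))
            count++;    /* advance only after a successful parse */
    }
    return count;
}
```

In the program above, the read loop would then reduce to something like size_t n = read_records(file, data, Nbodies); and the print loop would run from 0 to n. Because the counter is decoupled from the number of lines read, comment_count is no longer needed, and a file with fewer data lines than expected simply yields a smaller n instead of uninitialized entries.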