Unexpected behavior with fgets()+sscanf(): 1st row wrong and rounding floats


I'm very surprised by this behaviour. I have to be doing something wrong, but I can't find out what it is.

I had a 133*21 table in a .xml file and converted it to .csv. I didn't lose any information in this Excel conversion.

Then I wrote a simple program that reads that table into different structs:

typedef struct{
    float xval;
    float yval;
    float zval;
} tTuple_float;

typedef struct{
    int A;
    int B;
    int C;
} tTuple_int;

The program is this:

#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#define MAXCHAR 1000

int main(void) {
    tTuple_float C1[133], C2[133], C3[133], C4[133], C5[133], C6[133];
    tTuple_int ref[133];
    FILE *fp;
    int i=0;
    char row[MAXCHAR];
    fp = fopen("filename.csv","r");
    if (fp==NULL){
        printf("Error opening file\n");
        return 1;
    }
    i=0;
    setbuf(stdout, NULL);
    while (i<133){
        fgets(row, MAXCHAR, fp);
        printf("%s", row);         //To compare with the row printed with the arrays
        sscanf(row, "%d;%d;%d;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f",
            &ref[i].A, &ref[i].B, &ref[i].C,
            &C1[i].xval, &C1[i].yval, &C1[i].zval,
            &C2[i].xval, &C2[i].yval, &C2[i].zval,
            &C3[i].xval, &C3[i].yval, &C3[i].zval,
            &C4[i].xval, &C4[i].yval, &C4[i].zval,
            &C5[i].xval, &C5[i].yval, &C5[i].zval,
            &C6[i].xval, &C6[i].yval, &C6[i].zval);
        printf("%d;%d;%d;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f;%f\n",
            ref[i].A, ref[i].B, ref[i].C,
            C1[i].xval, C1[i].yval, C1[i].zval,
            C2[i].xval, C2[i].yval, C2[i].zval,
            C3[i].xval, C3[i].yval, C3[i].zval,
            C4[i].xval, C4[i].yval, C4[i].zval,
            C5[i].xval, C5[i].yval, C5[i].zval,
            C6[i].xval, C6[i].yval, C6[i].zval);
        i++;
        setbuf(stdout, NULL);
    }
    fclose(fp);
    return 0;
}

I added that printf("%s", row); to compare the string I was getting from fgets() with the values I was saving using sscanf().

Looking at the first two rows of the output, we can see that:

  1. sscanf() doesn't work at all on the first row;
  2. in the second row, the first float 715.973 is converted to 715.973022, instead of 715.973000;
  3. in the second row, the fifth float 619.22 is converted to 619.219971, instead of 619.220000.

So in some cases the stored value comes out slightly too high, and in others slightly too low. After some digging on Stack Overflow, I understand that floats are inexact, but what I don't know is how to work around this. Is there a way to truncate the float, or what is the best way to round it to 3 decimals?

Other than that, any off-topic improvement to the code itself is more than welcome.

EDIT: Providing minimal workable example (MWE) as follows

#include <stdio.h>
#include <stdlib.h>
#define MAXCHAR 1000

typedef struct{
    float xval;
    float yval;
    float zval;
} tTuple_float;

typedef struct{
    int A;
    int B;
    int C;
} tTuple_int;

int main(void) {
    tTuple_float C1[3];
    tTuple_int  ref[3];
    FILE *fp;
    int i=0;
    char row[MAXCHAR];
    setbuf(stdout, NULL);
    fp = fopen("three_row.csv","r");
    if (fp==NULL){
        printf("Error opening file\n");
        return 1;
    }
    i=0;
    while (i<3){
        fgets(row, MAXCHAR, fp);
        printf("%s", row);
        sscanf(row, "%d;%d;%d;%f;%f;%f;",&ref[i].A, &ref[i].B, &ref[i].C,
                &C1[i].xval, &C1[i].yval, &C1[i].zval);
        printf("%d;%d;%d;%f;%f;%f\n",ref[i].A, ref[i].B, ref[i].C,
                C1[i].xval, C1[i].yval, C1[i].zval);
        i++;
        setbuf(stdout, NULL);
    }
    fclose(fp);
    return 0;
}

And the three_row.csv:

1;2;3;111.111;222.222;333.333
4;5;6;444.444;555.555;666.666
7;8;9;777.777;888.888;999.999

My console output when I run the MWE:

1;2;3;111.111;222.222;333.333
524294;0;-13376;0.000000;0.000000;0.000000
4;5;6;444.444;555.555;666.666
4;5;6;444.444000;555.554993;666.666016
7;8;9;777.777;888.888;999.999
7;8;9;777.776978;888.888000;999.999023

Solution

  • As many pointed out in the comments, the issue was the encoding of my .csv. A file saved as UTF-8-BOM starts with a BOM (byte order mark, the bytes EF BB BF). sscanf() hit those bytes at the start of the first row, could not convert them with %d, and so stored nothing in the first element of any array; the values printed for that row were uninitialized garbage.

    To solve it, I just had to open the .csv in Notepad++, change the encoding to UTF-8 (no BOM) and save it.

    As for the second problem, the float inaccuracy, in John Bollinger's words: "You can round to fewer decimal digits for display, but you cannot get a float any closer to the target number."
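
To illustrate the rounding point, here is a minimal sketch in C. The helper names format_3dp and parse_double_field are hypothetical and do not come from the original post. The first rounds only the printed text, using "%.3f"; the stored value is untouched. The second reads a field into a double with strtod(). That is an assumption about what suits this data: a double cannot represent 619.22 exactly either, but its error is far below the sixth decimal that %f prints.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `value` into `out` with exactly three
   decimals. Only the text is rounded; the number itself is unchanged.
   Returns what snprintf() returns (the length of the formatted text). */
int format_3dp(char *out, size_t size, double value)
{
    return snprintf(out, size, "%.3f", value);
}

/* Hypothetical helper: parse one decimal field into a double.
   strtod() stops at the first character that cannot be part of the
   number (for example ';'). Returns 1 on success, 0 if no number
   was found at the start of `text`. */
int parse_double_field(const char *text, double *value)
{
    char *end;

    *value = strtod(text, &end);
    return end != text;
}
```

With these helpers, a float holding 715.973 would be formatted as 715.973 by format_3dp(), and the text "619.22" parsed by parse_double_field() would print as 619.220000 with %f. In the sscanf() call from the question, the equivalent change would be to declare the struct members as double and read them with %lf.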
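
For the BOM problem, an alternative to re-saving the file is to skip the three BOM bytes in code and to check how many fields sscanf() actually converted. The following is a sketch under assumptions, not the fix used in the post: skip_utf8_bom and parse_row are hypothetical names, and plain arrays stand in for the tTuple_int and tTuple_float structs to keep the example self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `line` starts with the UTF-8 BOM
   (bytes EF BB BF), return a pointer just past it; otherwise
   return `line` unchanged. */
const char *skip_utf8_bom(const char *line)
{
    static const char bom[] = "\xEF\xBB\xBF";

    return strncmp(line, bom, 3) == 0 ? line + 3 : line;
}

/* Hypothetical helper: parse one row in the format of the MWE,
   "%d;%d;%d;%f;%f;%f". Returns 1 only if all six fields were
   converted, so a malformed row is reported instead of leaving
   the outputs uninitialized without notice. */
int parse_row(const char *line, int ints[3], float vals[3])
{
    int converted = sscanf(skip_utf8_bom(line), "%d;%d;%d;%f;%f;%f",
                           &ints[0], &ints[1], &ints[2],
                           &vals[0], &vals[1], &vals[2]);

    return converted == 6;
}
```

In the loop from the MWE, each row read by fgets() would be passed to parse_row(), and a return value of 0 would be treated as an error rather than printed. The BOM can only appear at the very start of the file, so the check is a no-op on every row after the first.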