Reading a comma-separated list of numbers fails in C


I have a file containing a list of numbers separated by commas. I tried different methods of reading data, and this piece of code has worked without issues on different datasets.

Input for example (600 values): https://pastebin.com/AHJ5UpEu

#include <stdio.h>
#include <stdint.h>
#include <malloc.h>
#include <mem.h>

#define READ "r"
#define MAX_LINE_SIZE 4096
#define DATA_DELIMITER ","


unsigned char *readInput(const char *filename, size_t inputs) {
    unsigned char *input = malloc(sizeof(unsigned char) * inputs);
    unsigned char nbr;
    const char *token;
    int i;

    FILE *inputPtr = fopen(filename, READ);
    char line[MAX_LINE_SIZE];

    while (fgets(line, MAX_LINE_SIZE, inputPtr)) {
        nbr = 0;
        for (token = strtok(line, DATA_DELIMITER); token && *token; token = strtok(NULL, ",\n")) {
            input[nbr] = (unsigned char) atoi(token);
            nbr++;
        }
        break;
    }

    fclose(inputPtr);

    if(nbr != inputs){
        printf("Error, did not read all files. Only read %d\n",nbr);
        exit(-1);
    }
    exit(0);
}

int main() {


    unsigned char *d = readInput("../traces/inputs.dat", 600);
    free(d);
    exit(0);
}

However, it only reads the first 88 values. If I change MAX_LINE_SIZE to, for example, 512, this number becomes 145. If I understand correctly, the buffer only needs to be at least as long as the line, which in my case is ~2100 characters, so using 4096 shouldn't be an issue.

Please do correct me if I'm wrong.

Why am I reading only part of the data instead of all 600 values?


Solution

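  • nbr is used as an integer counter but is declared as an unsigned char. An unsigned char is one byte and, with 8-bit bytes, holds values from 0 to 255. Incrementing it past 255 wraps it around to 0, so nbr ends up holding the number of entries processed mod 256. For the 600-value input, 600 mod 256 = 88, which is exactly the count reported. Declaring nbr as a size_t (or an int) fixes the count. The 145 seen with a 512-byte buffer has a different cause: fgets stores at most 511 characters per call, and the loop breaks after the first call, so only the start of the line is parsed.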
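As a minimal sketch of the point above, the following C snippet reproduces the wraparound and shows one way to count tokens with a size_t instead. The function names count_with_uchar and parse_values are hypothetical and not from the original post, and the sketch assumes 8-bit bytes (UCHAR_MAX of 255).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: counts n items with an unsigned char counter, the way
   the question's code does. Assuming UCHAR_MAX is 255, the counter wraps
   to 0 after 255, so the result is n mod 256. */
unsigned char count_with_uchar(size_t n) {
    unsigned char nbr = 0;
    for (size_t i = 0; i < n; i++) {
        nbr++;
    }
    return nbr;
}

/* Hypothetical helper: splits `line` in place on commas and newlines and
   stores at most `cap` values in `out`. The count is a size_t, so it does
   not wrap at 256. Returns the number of values stored. */
size_t parse_values(char *line, unsigned char *out, size_t cap) {
    assert(line != NULL && out != NULL);
    size_t count = 0;
    for (char *token = strtok(line, ",\n");
         token != NULL && count < cap;
         token = strtok(NULL, ",\n")) {
        out[count++] = (unsigned char) atoi(token);
    }
    return count;
}
```

On such a platform, count_with_uchar(600) yields 88, matching the count in the question, while parse_values returns the full number of tokens it stored, up to cap.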