I have a large .csv data file that I am trying to read into a multidimensional array (each value in the .csv file is a string), but I am running into problems with the tokenization that I can't pin down. The while loop I use to tokenize each line is as follows:
while (!feof (st))
{
    fgets (row, CHAR, st);
    token = strtok (row, ",");
    while (token != NULL)
    {
        for (int row = 0; row < ROWS; row ++)
            for (int col = 0; col < COLS; col ++)
                strcpy (data [row][col], token);
        token = strtok (NULL, ",");
    }
}
What it is supposed to do: until the end of the file, it gets the next line of the .csv file, splits it at the commas that separate the values, and places each value from that line into the next empty spot in the array. It then repeats the process for each row.
However, when I test print the data array, rather than being filled with the data from the .csv file, it is filled with "1", which is the very last value in the .csv file (in the last column of the last row of the file). On top of that, the array is full of empty space and garbage at the bottom. I'm not even sure how that is possible, as I declared the exact size of the array from the beginning. (I also wrote a bit of code at the beginning of the program that cleans the .csv file by removing extra rows from the bottom of the file.)
Can anybody point me in the right direction? I feel like I am very close: when I print the tokenized values instead of placing them into the array, all the values are printed out correctly.
You're reading a token with strtok(), then looping through all the array elements and copying the token into each one. So at the end, they all contain the last token. You should just copy each token into a single element.
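int rownum = 0;                         /* index of the row being filled */
while (fgets (row, CHAR, st))           /* stops when no more lines can be read */
{
    int colnum = 0;                     /* restart at the first column for each line */
    token = strtok(row, ",\n");         /* "\n" as a delimiter drops the trailing newline */
    while (token != NULL)
    {
        /* copy this token into exactly one cell, then advance the column */
        strcpy(data [rownum][colnum++], token);
        token = strtok (NULL, ",\n");
    }
    rownum++;
}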
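For reference, here is a self-contained sketch of the same loop wrapped in a function. Several things in it are assumptions rather than part of the question or the loop above: the function name read_csv, the sizes ROWS, COLS, FIELD_LEN and LINE_LEN, and the layout of data as a fixed char array of strings. It also adds bounds checks, accepts "\r" as a delimiter, and uses snprintf() instead of strcpy() so that an over-long field is truncated rather than overflowing its cell.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ROWS      100   /* assumed maximum number of rows */
#define COLS      10    /* assumed maximum number of columns */
#define FIELD_LEN 32    /* assumed maximum field size, including '\0' */
#define LINE_LEN  1024  /* assumed maximum line length */

/* Hypothetical helper: reads comma-separated fields from fp into data.
   Returns the number of rows stored (at most ROWS). */
static int read_csv(FILE *fp, char data[ROWS][COLS][FIELD_LEN])
{
    char line[LINE_LEN];
    int rownum = 0;

    assert(fp != NULL);

    /* Let fgets() control the loop: it returns NULL at end of file. */
    while (rownum < ROWS && fgets(line, sizeof line, fp) != NULL) {
        int colnum = 0;
        char *token = strtok(line, ",\r\n");

        while (token != NULL && colnum < COLS) {
            /* Each token goes into exactly one cell. snprintf() truncates
               fields longer than FIELD_LEN - 1 and always terminates. */
            snprintf(data[rownum][colnum], FIELD_LEN, "%s", token);
            colnum++;
            token = strtok(NULL, ",\r\n");
        }

        /* Set the unused cells of this row to empty strings so they
           never hold leftover garbage. */
        for (; colnum < COLS; colnum++)
            data[rownum][colnum][0] = '\0';

        rownum++;
    }
    return rownum;
}
```

A caller would open the file with fopen(), check the result for NULL, and pass it in along with a char data[ROWS][COLS][FIELD_LEN] array; only the first rows up to the returned count are meaningful. Note that strtok() treats consecutive delimiters as one, so an empty field such as the middle one in a,,b is skipped rather than stored as an empty string.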
Other issues:
- Don't use !feof() as the while condition. Test the result of fgets() instead. See Why is “while( !feof(file) )” always wrong?
- fgets() includes the newline terminator. Either remove it first (see Removing trailing newline character from fgets() input) or include it in the delimiter string in strtok() so you don't copy it into the array.
- You use the name row for both the line read from the file and the index into the array. While they don't actually interfere in your code because they're in different scopes, it's confusing to readers. When I refactored the loops to move int row = 0; out of the loop, they ended up conflicting, so I renamed the indexes to rownum and colnum.
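To make the first two points concrete, here is a small sketch. The function names count_lines_feof, count_lines_fgets and strip_newline, and the LINE_LEN size, are made up for illustration. The first function reproduces the !feof() pattern, the second lets fgets() control the loop, and the third shows one common way to drop the trailing newline using strcspn().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LINE_LEN 256  /* assumed maximum line length */

/* Anti-pattern: feof() only becomes true after a read has already
   failed, so the body can run once more with a stale buffer. */
static int count_lines_feof(FILE *fp)
{
    char line[LINE_LEN] = "";
    int count = 0;

    while (!feof(fp)) {
        char *got = fgets(line, sizeof line, fp);
        (void)got;  /* result ignored on purpose: this is the bug */
        count++;
    }
    return count;
}

/* Preferred: fgets() returns NULL at end of file or on error. */
static int count_lines_fgets(FILE *fp)
{
    char line[LINE_LEN];
    int count = 0;

    while (fgets(line, sizeof line, fp) != NULL)
        count++;
    return count;
}

/* Cuts the string at the first '\r' or '\n', if there is one. */
static void strip_newline(char *s)
{
    assert(s != NULL);
    s[strcspn(s, "\r\n")] = '\0';
}
```

On a file containing two newline-terminated lines, count_lines_feof() runs its body three times while count_lines_fgets() runs it twice. That extra pass is one way a stale or empty row can end up at the bottom of the array.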