I'm writing a C program to tokenize an input text file and track the frequency of each word length, while also storing the corresponding words themselves. I have the word counting working fine, but I can't get my word_tracker array to store the strings correctly:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <math.h>

#define MAX_LENGTH 34
#define MAX_WORDS 750

int main(int argc, char *argv[]){
    FILE *fp; //input file
    const char *cur; //stores current word as string literal
    char words[MAX_LENGTH*MAX_WORDS]; //stores all words from text file
    char file_name[100]; //stores file name
    int word_count[MAX_LENGTH] = {0}; //array to store frequency of words based on length
    const char *word_tracker[MAX_LENGTH][MAX_WORDS]; //array to store string literals of each word, indexed by char count and word count
    int char_count; //current word's char count

    printf("Enter a file name: ");
    scanf("%s", file_name);
    fp = fopen(file_name, "r");
    if((fp==NULL)){
        printf("Failure: missing or unopenable file");
        return -1;
    }else{
        while(fgets(words, sizeof(words), fp)){
            cur = strtok(words, " -.,\b\t\n"); //first word of line
            char_count = strlen(cur);
            word_count[char_count-1] = word_count[char_count-1]+1; //increment frequency of specific word length
            word_tracker[char_count-1][word_count[char_count-1]-1] = cur; //store string into corresponding array index location
            /*test printing*/
            printf("%d:", char_count-1);
            printf("%s ", word_tracker[char_count-1][(word_count[char_count-1])-1]);
            while(cur){
                cur = strtok(NULL, " -.,\b\t\n"); //next word
                if(cur){
                    char_count = strlen(cur);
                    word_count[char_count-1] = word_count[char_count-1]+1; //increment frequency of specific word length
                    word_tracker[char_count-1][word_count[char_count-1]-1] = cur; //store string into corresponding array index location
                    /*test printing*/
                    printf("%d:", char_count-1); //test print
                    printf("%s ", word_tracker[char_count-1][(word_count[char_count-1])-1]); //test print
                }
            }
        }
    }

    //Testing word_tracker: (This doesn't work)
    printf("\n\n%s \n", word_tracker[0][0]);
    printf("\n%s \n", word_tracker[1][0]);
    printf("%s \n", word_tracker[2][0]);
    printf("%s \n", word_tracker[3][0]);
    printf("%s \n", word_tracker[4][0]);
    printf("%s \n", word_tracker[5][0]);
    return 0;
}
The "interior" tests (within the tokenizing loop) work well, the correct string and length are printed. However, the print tests at the end of main print seemingly random strings, relative to what the input text file says they should input. I have three theories on what I am doing wrong:
1) My indexing is wrong
2) My understanding of how to populate and use char* arrays is incorrect
3) My tokenizing loop is incorrect (does cur not equal "the isolated string"?)
I've noticed that the tests at the end of main display variants of whatever is written on the final line of the input file, so I think that my tokenizing loop is likely wrong. Any guidance is greatly appreciated, thank you!
Your result array currently is const char *word_tracker[MAX_LENGTH][MAX_WORDS], which is a 2D array of pointers. The problem is that the pointers you store in it come from strtok and point into your line buffer words; every subsequent fgets call overwrites that buffer, so after the loop all the stored pointers refer to whatever the last line contained. That is exactly why your final prints show variants of the file's last line.

You could either (a) use a 1D array of pointers and allocate memory for each word found, or (b) use a 2D array of characters and strcpy each word to the proper position.
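You can see the effect with a tiny standalone program (a sketch assuming two non-empty input lines are typed on stdin, e.g. "hello" followed by "world"):

#include <stdio.h>
#include <string.h>

int main(void){
    char buf[64];
    const char *saved;
    fgets(buf, sizeof buf, stdin); //first line, say "hello"
    saved = strtok(buf, " \n");    //saved points INTO buf, not to a private copy
    fgets(buf, sizeof buf, stdin); //second line, say "world", overwrites buf
    printf("%s", saved);           //prints "world", not "hello"
    return 0;
}

The same thing happens in your program: word_tracker ends up full of pointers into words, so every entry reflects the last line that fgets read.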
So (a) would look like...
const char *word_tracker[MAX_WORDS];
...
word_tracker[someIndexWithSomeMeaningUpToMAX_WORDS] = strdup(cur);
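strdup hands you a heap copy of the token, so the stored pointer stays valid after fgets reuses the line buffer; remember to free() each entry when you are finished (since your elements are const char *, you will need a cast when freeing, or drop the const), and note that strdup is a POSIX function (only added to the C standard in C23), so a strict ISO C build may need malloc plus strcpy instead. If you would rather keep the per-length 2D indexing from your question than switch to a single 1D index, the same idea drops straight into your existing loop body (a rough sketch of just the changed lines, reusing your variable names; checking strdup for NULL is omitted):

    char_count = strlen(cur);
    word_count[char_count-1] = word_count[char_count-1]+1;
    /* store a heap copy instead of a pointer into the reused buffer 'words' */
    word_tracker[char_count-1][word_count[char_count-1]-1] = strdup(cur);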
And (b) would look like
char word_tracker[MAX_WORDS][MAX_LENGTH];
...
strncpy(word_tracker[someIndexWithSomeMeaningUpToMAX_WORDS], cur, MAX_LENGTH);
word_tracker[someIndexWithSomeMeaningUpToMAX_WORDS][MAX_LENGTH-1] = '\0';
Note that in (b), MAX_LENGTH indicates the maximum length of a string (i.e. a single word) and is therefore the second index. strncpy makes sure not to exceed the size reserved for a word, but it does not null-terminate the destination when the source is that long or longer, which is why the last byte is set to '\0' explicitly.
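Put together, a minimal standalone sketch of approach (b) might look like the following. It keeps word_count indexed by length as in your program but stores the copied words under a single running index n; the hard-coded file name input.txt and the final print-out are only there for the demo:

#include <stdio.h>
#include <string.h>

#define MAX_LENGTH 34
#define MAX_WORDS 750

int main(void){
    char line[MAX_LENGTH*MAX_WORDS];          //line buffer reused by every fgets call
    char word_tracker[MAX_WORDS][MAX_LENGTH]; //owns a copy of every word
    int word_count[MAX_LENGTH] = {0};         //frequency of each word length
    int n = 0;                                //running word index
    FILE *fp = fopen("input.txt", "r");
    if(fp == NULL){
        perror("fopen");
        return -1;
    }
    while(fgets(line, sizeof line, fp)){
        for(char *cur = strtok(line, " -.,\b\t\n"); cur != NULL && n < MAX_WORDS; cur = strtok(NULL, " -.,\b\t\n")){
            size_t len = strlen(cur);
            if(len > MAX_LENGTH-1)            //skip words too long to fit in one row
                continue;
            word_count[len-1] = word_count[len-1]+1;
            strncpy(word_tracker[n], cur, MAX_LENGTH); //copy the characters, not the pointer
            word_tracker[n][MAX_LENGTH-1] = '\0';      //guarantee termination
            n++;
        }
    }
    fclose(fp);
    for(int i = 0; i < n; i++)                //the copies survive after the loop ends
        printf("%s\n", word_tracker[i]);
    return 0;
}

Because the characters themselves are copied, the contents of word_tracker no longer depend on whatever happens to be in the line buffer when you print them.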