I have a text file that holds an index of files and the words (with their frequencies) that appear in them. I need to read the file into memory and store the words so they can be searched. The file is formatted as follows:
<files> 169
0:file0.txt
1:file1.txt
2:file2.txt
3:file3.txt
... etc ...
</files>
<list> word 2
9: 10
1: 2
</list>
<list> word2 4
3: 19
5: 12
0: 2
8: 2
</list>
... etc ...
The problem is that this index file can become extremely large and won't all fit into memory at once. My solution is to keep only a handful of words in a hash table at any one time; when I need the data for another word, I would evict an old word and then parse the new word's data from the file.
How can I efficiently accomplish this in C? I was thinking I would have to do something with fseek and rewinding once I got to certain points.
Thanks,
Mike
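One way to do the fseek idea is to scan the file once, remember where each <list> block starts, and later jump straight to that offset. Below is a minimal sketch of that, not a definitive implementation. The names build_offsets, total_frequency, struct word_offset and MAX_WORD are made up for illustration, and it assumes words are shorter than 64 characters, lines are shorter than 256 characters, and each "a: b" line means "file id: frequency".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD 64   /* assumed upper bound on word length (hypothetical) */

/* One entry per word: where its "<list> word n" line starts in the file. */
struct word_offset {
    char word[MAX_WORD];
    long offset;
};

/* Scan the whole index once and record the offset of every <list> header.
   Returns the number of entries written to table (at most cap). */
size_t build_offsets(FILE *f, struct word_offset *table, size_t cap)
{
    char line[256];
    size_t n = 0;

    assert(f != NULL);
    rewind(f);
    for (;;) {
        long pos = ftell(f);          /* offset of the line about to be read */
        if (fgets(line, sizeof line, f) == NULL)
            break;
        /* Only "<list> word n" lines match this pattern. */
        if (n < cap && sscanf(line, "<list> %63s", table[n].word) == 1) {
            table[n].offset = pos;
            n++;
        }
    }
    return n;
}

/* Jump to a word's block and sum its per-file counts.
   Returns the total, or -1 if the word is unknown or the seek fails. */
long total_frequency(FILE *f, const struct word_offset *table,
                     size_t n, const char *word)
{
    char line[256];
    int file_id, count;
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(table[i].word, word) != 0)
            continue;
        if (fseek(f, table[i].offset, SEEK_SET) != 0)
            return -1;
        if (fgets(line, sizeof line, f) == NULL)   /* skip the <list> header */
            return -1;
        /* Read "id: count" lines until </list> stops the pattern matching. */
        while (fgets(line, sizeof line, f) != NULL &&
               sscanf(line, "%d: %d", &file_id, &count) == 2)
            total += count;
        return total;
    }
    return -1;
}
```

The table lookup here is a plain linear search to keep the sketch short. With a very large index you would probably key the offsets by word in your hash table instead, and only the small (word, offset) pairs would need to stay in memory while the per-file counts stay on disk.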
It turned out that the best way to do this (for my needs) was to keep a pointer to the current location in the file and then use rewind( FILE *f ); when I reached the end.
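For anyone who wants to try the same thing, here is a rough sketch of how the forward scan with a wrap-around could look. This is a reconstruction under assumptions, not the exact code used: seek_word is a hypothetical name, lines are assumed to be shorter than 256 characters, and ftell values are assumed to be comparable byte positions (true on POSIX systems and for binary streams).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Scan forward from the current position for "<list> word ...". On EOF,
   rewind once and keep scanning until back at the starting position.
   Returns 1 with the stream positioned just after the header line,
   or 0 if the word is not in the file. */
int seek_word(FILE *f, const char *word)
{
    char line[256], found[64];
    long start;
    int wrapped = 0;

    assert(f != NULL && word != NULL);
    start = ftell(f);
    if (start < 0)
        return 0;
    for (;;) {
        if (wrapped && ftell(f) >= start)
            return 0;                 /* came full circle: word not present */
        if (fgets(line, sizeof line, f) == NULL) {
            if (wrapped)
                return 0;
            rewind(f);                /* end reached: continue from the top */
            wrapped = 1;
            continue;
        }
        if (sscanf(line, "<list> %63s", found) == 1 &&
            strcmp(found, word) == 0)
            return 1;
    }
}
```

After a successful call, the next fgets returns the first "id: count" line of that word, so the caller can parse the block and leave the stream where it is for the next lookup. This works best when lookups tend to move forward through the file; a word that sits just before the current position costs almost a full pass.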