I have program that loads a (sometimes) large CSV file into an array. The data cannot be sorted, and I do not know if the data is text or numbers. This is up to customers.
Example could be
1;JOHN;DOE
2;JANE;DOE;
3;BOBBY;NOTABLES
but it could also be strings
MB9384HJ;TEST1
B9284918;TEST2
The number of lines could be up to a few million.
I would like to seach for a specific value in a column (which one is known ahead of time, this is my "key index column"). Assume this is unique. Key is to find which row this column is in.
Currently the code is traversing from 1..n and comparing. This is obviously slower towards the end.
I am considering these options:
My idea is: instead of traversing the array, I query the index for the key (client provides item to search for, it must be random-access). Then I immediately get the rownumber of the array, and I can fetch the data.
Which of the these (or other, if any) would be a better path to follow ?
SQLite is probably too much if you just want to search for the key. It would be interesting if you fill a SQLite table with the CSV and have to do complex queries not only on the keys but also the other columns.
A Hashed string list is probably the faster but there is a problem with hash collisions.
A Dictionary is probably the best solution in your specific case. And it is easy since Delphi RTL provide the required generic class.