I found some answers on StackOverflow, but nothing fits my needs exactly.
I am writing a Ruby script to find rows by a specific key in large CSV files (~500 MB and 1M records per file).
The grep command takes 15 to 30 minutes to find a match in a single file.
I have 400+ files, and I have to run dozens of searches daily.
I need a simple, flexible, and affordable way to search these files.
In the end, I spent one day of work developing this solution: CSV-Indexer.
CSV-Indexer is not as robust as Lucene, but it is simple and cost-effective. It can index files with millions of rows and find specific rows in a matter of seconds.
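
To illustrate the general idea of indexing a CSV file by key, here is a minimal sketch in plain Ruby. It is not CSV-Indexer's actual API or implementation: the helper names build_offset_index and find_row are hypothetical, the index is kept in memory as a Hash, and the sketch assumes one record per line (no newlines inside quoted fields) with the key in the first column.

```ruby
require "csv"

# Hypothetical helper (not part of CSV-Indexer): scan the file once and
# record the byte offset at which each row starts, keyed by one column.
# Assumes every record fits on a single line.
def build_offset_index(path, key_column: 0)
  index = {}
  # Binary mode keeps byte offsets exact; strings are tagged as UTF-8.
  File.open(path, "rb:UTF-8") do |file|
    until file.eof?
      offset = file.pos
      fields = CSV.parse_line(file.readline)
      key = fields && fields[key_column]
      # If a key repeats, the last occurrence wins in this sketch.
      index[key] = offset unless key.nil?
    end
  end
  index
end

# Hypothetical helper: jump straight to the stored offset and parse only
# that one line, instead of scanning the whole file again.
def find_row(path, index, key)
  offset = index[key]
  return nil if offset.nil?

  File.open(path, "rb:UTF-8") do |file|
    file.seek(offset)
    CSV.parse_line(file.readline)
  end
end
```

The file is read in full only once, when the index is built. Each later lookup costs one seek and one line read, which is why an index-based search can be much faster than running grep over the whole file every time. Note that the header row is indexed like any other row in this sketch.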
Find full documentation and examples here: