I'm planning to code a search engine in PHP that will allow my company to search for text contained in multiple projects, each project stored in its own SQLite database file.
Since at some point there will be well over 100 projects (over 100 SQLite databases), I was wondering which (if any) of the following would be a smarter programming choice:
I really don't know how long it will take to do either option, or which is better practice. Most of the database files are <1MB.
Thanks a lot!
First of all: don't forget that unless you're developing a graphical application in e.g. PHP-GTK, PHP page loads are stateless. That means that if you chose option 1 you'd need to cache the data somewhere (for example in a different DB). I wouldn't keep it in memory in any case.
Also, it depends on what kind of indexes you have set up. 100 text searches can be really fast if the databases have full-text indexes.
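As a minimal sketch of what such an index could look like (the table name `documents`, column `body`, and the file path are just placeholder assumptions about your schema):

```php
<?php
// Build a full-text index (FTS4 here) next to the existing data in one project DB.
$db = new PDO('sqlite:/path/to/project.sqlite');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Virtual table holding the searchable text
$db->exec("CREATE VIRTUAL TABLE IF NOT EXISTS documents_fts USING fts4(body)");

// Copy over rows that aren't indexed yet (run once, or after updates)
$db->exec("INSERT INTO documents_fts(rowid, body)
           SELECT id, body FROM documents
           WHERE id NOT IN (SELECT rowid FROM documents_fts)");

// MATCH queries use the full-text index instead of scanning every row
$stmt = $db->prepare("SELECT rowid, body FROM documents_fts WHERE body MATCH :term");
$stmt->execute([':term' => 'invoice']);
$results = $stmt->fetchAll(PDO::FETCH_ASSOC);
```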
So, looping through the files is an option. There will be some overhead from having to open 100 different SQLite files, and you shouldn't forget to close each file once you're done with it to keep memory usage down. You'll also need to make sure all the SQLite DBs are properly indexed. A rough sketch of that loop follows.
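Something along these lines, assuming each project DB has a full-text table like the one above (the glob pattern and table/column names are placeholders):

```php
<?php
// Search every project file in turn and collect the hits.
function searchAllProjects(string $term, string $dir): array
{
    $hits = [];
    foreach (glob($dir . '/*.sqlite') as $file) {
        $db = new PDO('sqlite:' . $file);
        $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

        $stmt = $db->prepare(
            "SELECT rowid, body FROM documents_fts WHERE body MATCH :term"
        );
        $stmt->execute([':term' => $term]);

        foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
            $row['project'] = basename($file);
            $hits[] = $row;
        }

        $db = null; // close the connection before moving on to the next file
    }
    return $hits;
}

$results = searchAllProjects('invoice', '/path/to/projects');
```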
Another possibility is to create a local DB containing all the searchable data plus metadata recording which SQLite file each row came from and the last timestamp at which that file was checked. Then on every request you can compare the last-modification timestamps of the SQLite files against the stored ones, copy any new data from modified SQLite DBs into your local DB, update the timestamps, and run the search against your local DB only. The performance in this case will depend on how often the SQLite files are updated and how much data has to be synced, but I believe that in your case it will suffice.
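A sketch of that sync-then-search idea; the central tables (`search_index`, `sync_state`), the `documents` source table, and the paths are all assumptions about your setup:

```php
<?php
// Central DB that aggregates searchable text from all project files.
$central = new PDO('sqlite:/path/to/central.sqlite');
$central->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

foreach (glob('/path/to/projects/*.sqlite') as $file) {
    $mtime = filemtime($file);

    // When did we last sync this project file?
    $stmt = $central->prepare("SELECT synced_at FROM sync_state WHERE file = :f");
    $stmt->execute([':f' => $file]);
    $syncedAt = $stmt->fetchColumn();

    if ($syncedAt !== false && $mtime <= (int) $syncedAt) {
        continue; // unchanged since last sync, nothing to copy
    }

    // Re-copy the searchable rows for this project
    $central->beginTransaction();
    $central->prepare("DELETE FROM search_index WHERE file = :f")
            ->execute([':f' => $file]);

    $project = new PDO('sqlite:' . $file);
    $project->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $insert = $central->prepare(
        "INSERT INTO search_index (file, source_id, body) VALUES (:f, :id, :body)"
    );
    foreach ($project->query("SELECT id, body FROM documents") as $row) {
        $insert->execute([':f' => $file, ':id' => $row['id'], ':body' => $row['body']]);
    }
    $project = null; // close the project file

    $central->prepare(
        "INSERT OR REPLACE INTO sync_state (file, synced_at) VALUES (:f, :t)"
    )->execute([':f' => $file, ':t' => $mtime]);
    $central->commit();
}

// Now search only the central DB (search_index could itself be an FTS table).
$stmt = $central->prepare(
    "SELECT file, source_id, body FROM search_index WHERE body LIKE :q"
);
$stmt->execute([':q' => '%invoice%']);
$results = $stmt->fetchAll(PDO::FETCH_ASSOC);
```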