My application (PHP/MySQL/JS) has built-in search functionality. One of the search criteria is a set of checkboxes for various options, so some results are more relevant than others depending on how many of the selected options they contain.
For example, say the options are A and B. If I search for both A and B, Result 1, which contains only option A, is 50% relevant, while Result 2, which contains both A and B, is 100% relevant.
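To make the 50% / 100% example concrete, here is a minimal sketch of that scoring rule in JavaScript. It assumes relevance is simply "matched options divided by selected options"; the function names and the shape of the result objects are made up for illustration, and the same ratio could just as well be computed in PHP or in the SQL query itself.

```javascript
// Hypothetical helper: fraction of the selected options that a result contains.
function relevance(selectedOptions, resultOptions) {
  if (selectedOptions.length === 0) return 0; // nothing selected, nothing to match
  const has = new Set(resultOptions);
  const matched = selectedOptions.filter((opt) => has.has(opt)).length;
  return matched / selectedOptions.length;
}

// Hypothetical helper: score every result, drop non-matches, best match first.
function rankResults(selectedOptions, results) {
  return results
    .map((r) => ({ ...r, relevance: relevance(selectedOptions, r.options) }))
    .filter((r) => r.relevance > 0)
    .sort((a, b) => b.relevance - a.relevance);
}

// The example from the question: searching for both A and B.
const ranked = rankResults(['A', 'B'], [
  { name: 'Result 1', options: ['A'] },
  { name: 'Result 2', options: ['A', 'B'] },
]);

// Result 2 (100%) sorts ahead of Result 1 (50%).
console.log(ranked.map((r) => `${r.name}: ${r.relevance * 100}%`));
```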
Previously I just ran simple SQL queries based on form input, but this one's a little harder: it's not as simple as data LIKE "%query%", because some results are more valuable for a given search query than others.
I have absolutely no idea where to begin... does anybody have relevant (ha!) reading material to direct me to?
Edit: After mulling it over, I'm thinking something involving an SQL script to get the raw data, followed by many many rounds of parsing is something I'd have to do...
Nothing cacheable, though? :(
Have a look at the Lucene project; it is available in many languages.
This is the PHP port: http://framework.zend.com/manual/en/zend.search.lucene.html
It indexes the items to be searched and returns relevance-weighted results, which works better than select x from y where name like '%pattern%' style searching.
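For a rough idea of what "index the items, then return weighted results" means, here is a toy sketch in JavaScript. It is not Lucene's API or its scoring formula; it only shows the general shape, assuming each item has an id and a list of options. All names here are hypothetical. The index is built once and can be reused across searches, which is where the caching comes from.

```javascript
// Build once (for example whenever the data changes): option -> Set of item ids.
function buildOptionIndex(items) {
  const index = new Map();
  for (const item of items) {
    for (const opt of item.options) {
      if (!index.has(opt)) index.set(opt, new Set());
      index.get(opt).add(item.id);
    }
  }
  return index;
}

// Query time: count how many selected options each item matches, then sort by score.
function searchIndex(index, selectedOptions) {
  const hits = new Map(); // item id -> number of matched options
  for (const opt of selectedOptions) {
    for (const id of index.get(opt) || []) {
      hits.set(id, (hits.get(id) || 0) + 1);
    }
  }
  return [...hits.entries()]
    .map(([id, count]) => ({ id, score: count / selectedOptions.length }))
    .sort((a, b) => b.score - a.score);
}

const optionIndex = buildOptionIndex([
  { id: 1, options: ['A'] },
  { id: 2, options: ['A', 'B'] },
  { id: 3, options: ['C'] },
]);

// Item 2 matches both options, item 1 matches one, item 3 matches none and is omitted.
console.log(searchIndex(optionIndex, ['A', 'B']));
```

A real search library adds tokenizing, term weighting, and persistence on top of this, so treat the sketch only as a picture of the idea.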