How do I retrieve records from a MySQL (InnoDB) table with the results sorted by how many of the search sub-strings a record's two columns contain?
I'm searching for words in bodies of text stored in two different columns, so a word may be preceded or followed by unknown text.
Currently my PHP code builds a query that searches the two text columns for a varying number of sub-strings.
See simplified example:
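$string = "";
foreach ($array_of_words as $word) {
    // One pair of LIKE conditions per search word (assumes $word is already escaped).
    $string .= " column_one LIKE '% $word%' OR column_two LIKE '% $word%' OR ";
}
// Strip the trailing "OR " left by the last iteration.
$string = substr($string, 0, -3);
$query = "SELECT * FROM tbl_maintable WHERE (" . $string . ") LIMIT 5";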
This works fairly well, even though it is not exactly what I would like. The biggest issue with the above code, however, is that it is resource intensive: a query against a 100k-record database takes 7 seconds to complete.
I have looked at natural language searches (to a certain extent), but the documentation hasn't given me a clear, definitive method I can use, and the MySQL version (5.5.44) on the server in question does not offer that option for InnoDB. I will upgrade, however, if that is the only option.
Maybe you could use MySQL's fulltext search and order by relevancy.
You need to alter your table to add a FULLTEXT index. InnoDB supports these from MySQL 5.6 onward, so the 5.5.44 server would need upgrading first.
ALTER TABLE `tbl_maintable` ADD FULLTEXT INDEX `SEARCH`(`columnName`);
And then run a query like this:
SELECT *, MATCH(columnName) AGAINST ('word1 word2 word3 word4' IN BOOLEAN MODE) AS relevance
FROM tbl_maintable WHERE
MATCH (columnName) AGAINST ('word1 word2 word3 word4' IN BOOLEAN MODE)
ORDER BY relevance DESC
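Since the question searches two columns, the same idea can be sketched with one index covering both. This is only an illustration under assumptions: the index name ft_both_columns is made up, the column names are taken from the question, and the server is assumed to be MySQL 5.6 or later. The column list in MATCH() has to be the same as the column list of the FULLTEXT index.

```sql
-- Hypothetical index name; columns are the two from the question.
ALTER TABLE tbl_maintable
  ADD FULLTEXT INDEX ft_both_columns (column_one, column_two);

-- Rows matching more of the words get a higher relevance score.
-- A trailing * in BOOLEAN MODE matches any word starting with that prefix.
SELECT *,
       MATCH(column_one, column_two)
         AGAINST ('word1* word2* word3*' IN BOOLEAN MODE) AS relevance
FROM tbl_maintable
WHERE MATCH(column_one, column_two)
        AGAINST ('word1* word2* word3*' IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 5;
```

Note that fulltext search matches whole words (or word prefixes with *), not arbitrary sub-strings, and very short words may be skipped depending on the server's minimum token size setting.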
There are other methods to achieve this, but without fulltext indexing they will perform very poorly, as relational databases weren't really built for this kind of work.
For a non-fulltext option, take a look at this answer, but be aware that it will cost you in performance: https://stackoverflow.com/a/20320723/933633
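For illustration, one common non-fulltext way to rank by number of matched words (not necessarily the approach in the linked answer) relies on MySQL evaluating a comparison to 1 or 0, so the per-word conditions can be added up. The alias matched_words is made up for this sketch, and the columns are assumed to be NOT NULL, since a NULL column can turn the whole sum into NULL.

```sql
-- Each parenthesised condition evaluates to 1 (word found in either column) or 0.
SELECT *,
       ( (column_one LIKE '% word1%' OR column_two LIKE '% word1%')
       + (column_one LIKE '% word2%' OR column_two LIKE '% word2%')
       + (column_one LIKE '% word3%' OR column_two LIKE '% word3%')
       ) AS matched_words
FROM tbl_maintable
HAVING matched_words > 0        -- MySQL allows a select alias in HAVING
ORDER BY matched_words DESC
LIMIT 5;
```

This still scans every row, because a LIKE pattern with a leading % cannot use an ordinary index, so it gives the ordering but not the speed-up.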