I have a MySQL database of recipes. One of the columns, NAME, is a varchar(255). Each NAME can be multiple words (e.g., "Chocolate Chip Cookies" and "Oatmeal Cookies").
I'm looking for a way to efficiently parse the recipe names into individual words, then return a list of the most common words. So, given the two recipes above, the query would return:
cookies, 2
chocolate, 1
chip, 1
oatmeal, 1
Is there an efficient way to do this? I could query all the rows, split each recipe name into words, and build the word counts in a data structure outside the database. Alternatively, I could split the strings up ahead of time and build a separate word table.
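For example, the pre-built table approach I have in mind would be something like this (table and column names are just placeholders):

    CREATE TABLE recipe_words (
        recipe_id INT NOT NULL,
        word      VARCHAR(255) NOT NULL,
        KEY idx_word (word)
    );

    -- Kept in sync whenever a recipe is inserted or renamed, then:
    SELECT word, COUNT(*) AS word_count
    FROM recipe_words
    GROUP BY word
    ORDER BY word_count DESC;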
I guess what I am really hoping for is to leverage the database to do this faster (like with an index). I don't know if that's possible with a single query.
So, am I stuck building a table ahead of time? Or can this be done efficiently with a single query?
Thanks!
If your table uses the MyISAM storage engine and you build a FULLTEXT index over your NAME column, you could use myisam_ftdump's --count option to obtain an overall count of each word. Sadly, however, this would be outside of SQL.
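Very roughly, it could look like the following sketch. The table and index names are placeholders, and the trailing index number passed to myisam_ftdump depends on what other indexes your table has:

    -- Assumes a MyISAM table named recipes with a NAME column (names are hypothetical).
    ALTER TABLE recipes ENGINE = MyISAM;
    ALTER TABLE recipes ADD FULLTEXT INDEX ft_name (NAME);

    -- Check where the FULLTEXT index sits among the table's indexes
    -- (myisam_ftdump numbers indexes starting from 0, so with a primary
    -- key plus this index, ft_name would typically be index 1):
    SHOW INDEX FROM recipes;

    -- Then, from the shell in the table's database directory (the part outside SQL):
    --   myisam_ftdump --count recipes 1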