Search code examples
mysqlfull-text-searchfull-text-indexingfulltext-index

MySQL full text query across multiple columns


I have an issue that I'm confused about when using a full text search in MySQL. I'm running MySQL 5.1.67. Here is my simplified table:

CREATE TABLE `contact_info` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`first_name` varchar(25) DEFAULT NULL,
`last_name` varchar(35) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `idx_ft_first_name_last_name` (`first_name`,`last_name`)
) ENGINE=MyISAM CHARSET=latin1;

I want to query for first name and last name in a table with about 5 million rows. First name and last name are in separate columns as shown in the above table. I want to get all the rows where the first and last name are "John Smith". I don't want all the people that have John as their first name or Smith as their last name. I know I could do something like the following but because users sometimes chose to store the full name in the first_name column or in the last_name column I'm trying to setup a full text search.

SELECT ci.first_name,ci.last_name FROM contact_info ci
WHERE  ci.first_name LIKE 'John%' AND ci.last_name LIKE 'Smith%' 

When I run the query forcing it to use the primary key index then it returns a few hundred records for John Smith, this is what I want but it is slow and takes about 5 seconds. Here is the query:

SELECT ci.first_name,ci.last_name FROM contact_info ci USE INDEX(PRIMARY)
WHERE MATCH(ci.first_name,ci.last_name) AGAINST ('"John Smith"' IN BOOLEAN MODE);

When I run the query with the index the optimizer prefers, it doesn't return anything.

SELECT ci.first_name,ci.last_name FROM contact_info ci USE INDEX(idx_ft_first_name_last_name)
WHERE MATCH(ci.first_name,ci.last_name) AGAINST ('"John Smith"' IN BOOLEAN MODE);

Why isn't it returning anything? Is it not possible to query for "John Smith" across two columns with a full text index like this?


Solution

  • The documentation for FULLTEXT searching says this:

    A phrase that is enclosed within double quote (“"”) characters matches only rows that contain the phrase literally, as it was typed. The full-text engine splits the phrase into words and performs a search in the FULLTEXT index for the words. Nonword characters need not be matched exactly: Phrase searching requires only that matches contain exactly the same words as the phrase and in the same order. For example, "test phrase" matches "test, phrase".

    You've put the phrase "John Smith" into double quotes. But you don't have any single column containing that phrase: it's split across columns.

    Try leaving off the double quotes.

    Try ('John AND Smith' IN BOOLEAN MODE) and see what you get. It may work. Also, FULLTEXT should rank the records where both word match higher than those where only one matches.