Search code examples
mysqloptimizationsearch-engine

MySQL query - weight-based search engine


Here is the database structure for a medical MIS table:

CREATE TABLE IF NOT EXISTS `ab_med` (
  `med_id` bigint(20) NOT NULL AUTO_INCREMENT,
  `med_title` varchar(200) DEFAULT NULL,
  `med_posted_date` varchar(200) DEFAULT NULL,
  `med_company_name` varchar(200) DEFAULT NULL,
  `med_experience` varchar(200) DEFAULT NULL,
  `med_salary` varchar(200) DEFAULT NULL,
  `med_city` varchar(200) DEFAULT NULL,
  `med_description_short` text,
  `med_details_link` varchar(500) DEFAULT NULL,
  `med_from_city_type` int(1) NOT NULL DEFAULT '1',
  `med_from_city` varchar(255) DEFAULT NULL,
  `med_collected_time` datetime DEFAULT NULL,
  `med_status` int(1) NOT NULL,
  PRIMARY KEY (`med_id`),
  KEY `MedCity` (`med_city`),
  KEY `MedTitle` (`med_title`),
  KEY `MedCompany` (`med_company_name`),
  KEY `MedExperience` (`med_experience`),
  KEY `med_details_link` (`med_details_link`),
  FULLTEXT KEY `med_index` (`med_title`,`med_company_name`,`med_description_short`)
) ENGINE=MyISAM  DEFAULT CHARSET=latin1 AUTO_INCREMENT=639295 ;

There are around 0.5 million records. A user can search using 2 fields:

  1. Keyword
  2. Location

Keyword is looked into med_title, med_company_name, med_description_short, med_city as per the score.

Location is looked into med_city

I've used full text mysql search but I need the most recent entries to show up with better score. I'm not getting the SQL query for it. I'm using the following SQL query:

SELECT distinct(med_details_link),med_id,med_title,med_posted_date,med_company_name,med_description_short,med_from_city,med_experience,med_city,med_collected_time,MATCH (med_title,med_company_name,med_description_short) AGAINST ('+hello +world' IN BOOLEAN MODE) as score
FROM ab_med
WHERE MATCH (med_title,med_company_name,med_description_short) AGAINST ('+hello +world' IN BOOLEAN MODE) order by med_id desc limit 1000

Suppose I search for "hello world", I understand that there should be weight based on:

  1. title would have highest weight
  2. company name will have weight less than title but more than description
  3. description will have least weight

Please help in getting a SQL query to consider all these weights and search efficiently.


Solution

  • Consider turning to Sphinx to get a satisfactory solution to this problem.

    With 5 million rows, and the requirements you have, the performance of MySQL fulltext search will be disappointing.