Tags: mongodb, mapreduce, full-text-search, morphia, nosql

Morphia/MongoDB: ordering search results from advanced queries


I'm fairly new to Morphia, MongoDB, and document-oriented databases in general. I'm looking for general guidance on how to approach the following problem.

We have a DB with around 500K Book documents.

{ 
   "isbn" : "0-691-01305-5", 
   "title" : "For Whom the Bell Tolls", 
   "titleFTS" : [
       "bell",
       "toll" ],
   "author" : "Hemingway, Ernest",
   "ratingsCount" : 138, 
   "rating" : "3.5", 
   "sales" : 10245,
   "price" : "12.95", 
   "category" : "fiction", 
   "description" : "The story of a young American in the International Brigades attached to a republican guerilla unit during the Spanish Civil War.",
   "descriptionFTS" : [
       "story",
       "young",
       "americ",
       "internat",
       "brigade",
       "attach",
       "republic",
       "guerilla",
       "unit",
       "spanish",
       "civil",
       "war"]
}

We need to perform full-text searches over the title and description fields. To that end, I have created titleFTS and descriptionFTS arrays that contain the words from the title and description fields respectively, with stop words removed and the remaining words stemmed.
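
For illustration, the preprocessing described above might look roughly like the sketch below. Everything in it is an assumption: `STOP_WORDS`, `toyStem`, and `toFtsTerms` are hypothetical names, the stop-word list is deliberately tiny, and `toyStem` only strips a plural "s". It is a stand-in for a real stemmer, not the one that produced terms like "americ" or "internat" in the sample document.

```javascript
// Hypothetical stop-word list; a real one would be much longer.
const STOP_WORDS = new Set(["a", "an", "and", "for", "in", "of", "the", "to", "whom"]);

// Toy stand-in for a real stemmer: strips a trailing plural "s" only.
function toyStem(word) {
  return word.length > 3 && word.endsWith("s") && !word.endsWith("ss")
    ? word.slice(0, -1)
    : word;
}

// Lowercase, split on non-letters, drop stop words, stem, and de-duplicate.
function toFtsTerms(text, stem = toyStem) {
  const words = text.toLowerCase().split(/[^a-z]+/).filter(Boolean);
  const terms = words.filter((w) => !STOP_WORDS.has(w)).map(stem);
  return [...new Set(terms)];
}

console.log(toFtsTerms("For Whom the Bell Tolls")); // [ 'bell', 'toll' ]
```

Whatever pipeline is used for the stored arrays, the user's search keywords need to pass through the same one, otherwise a query for "tolls" would never match the stored "toll".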

When searching, users enter keywords, and we return the Books that match all of the entered terms, e.g.:

db.Book.find({ titleFTS : { $all: ['spanish', 'civil', 'war']}})
db.Book.find({ descriptionFTS : { $all: ['spanish', 'civil', 'war']}})
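
If the candidates from both fields are wanted in a single query, the two filters above can be combined with `$or`. This is a minimal sketch, assuming the terms have already been filtered and stemmed the same way as the stored arrays; `buildSearchQuery` is a hypothetical helper name, not part of MongoDB or Morphia.

```javascript
// Builds a filter matching books whose title terms OR description terms
// contain every one of the given (already stemmed) search terms.
function buildSearchQuery(terms) {
  return {
    $or: [
      { titleFTS: { $all: terms } },
      { descriptionFTS: { $all: terms } },
    ],
  };
}

console.log(JSON.stringify(buildSearchQuery(["spanish", "civil", "war"])));
```

In the mongo shell the result could be passed straight to `db.Book.find(...)`. Note that this only selects the candidates; it does not say which of the two fields matched.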

This works fine, but now we come to the tough part: we'd like to order the results from the above queries based on multiple criteria. One such proposed ordering is the following:

  1. books matching search terms in both titleFTS and descriptionFTS fields
  2. books matching in only the titleFTS field
  3. books matching in only the descriptionFTS field
  4. books with greatest # of sales
  5. books with highest rating
  6. books with highest ratingsCount

Our app is written in Java and uses the Morphia API. I can envision how to write a Java Comparator for this sort of thing pretty easily, but obviously I'd like to do the ordering at the DB level.
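
To make the proposed ordering concrete, here is the client-side comparator logic sketched in JavaScript (the same structure would carry over to a Java Comparator). It rests on one interpretation that is an assumption: criteria 1 to 3 form a match tier, and sales, rating, and ratingsCount act as successive tie-breakers within a tier. `matchesAll`, `matchTier`, and `byRelevance` are hypothetical names, and the sample books are made up.

```javascript
// True when every search term appears in the stemmed-term array.
function matchesAll(ftsTerms, terms) {
  return terms.every((t) => ftsTerms.includes(t));
}

// 0 = both fields match, 1 = title only, 2 = description only, 3 = no match.
function matchTier(book, terms) {
  const inTitle = matchesAll(book.titleFTS, terms);
  const inDesc = matchesAll(book.descriptionFTS, terms);
  if (inTitle && inDesc) return 0;
  if (inTitle) return 1;
  if (inDesc) return 2;
  return 3;
}

// Comparator for Array.prototype.sort: tier first, then sales, rating and
// ratingsCount, all descending. rating is stored as a string, so parse it.
function byRelevance(terms) {
  return (a, b) =>
    matchTier(a, terms) - matchTier(b, terms) ||
    b.sales - a.sales ||
    parseFloat(b.rating) - parseFloat(a.rating) ||
    b.ratingsCount - a.ratingsCount;
}

// Made-up sample data, for illustration only.
const books = [
  { id: "A", titleFTS: ["war"], descriptionFTS: ["peace"], sales: 10, rating: "4.0", ratingsCount: 5 },
  { id: "B", titleFTS: ["war"], descriptionFTS: ["war"], sales: 1, rating: "3.0", ratingsCount: 2 },
  { id: "C", titleFTS: ["sea"], descriptionFTS: ["war"], sales: 999, rating: "5.0", ratingsCount: 9 },
  { id: "D", titleFTS: ["war"], descriptionFTS: ["war"], sales: 50, rating: "2.5", ratingsCount: 7 },
];
const sorted = [...books].sort(byRelevance(["war"]));
console.log(sorted.map((b) => b.id)); // [ 'D', 'B', 'A', 'C' ]
```

Book C has by far the most sales but still sorts last, because the match tier is compared before any of the popularity fields.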

Which finally brings me to the question: can this be done using the Morphia API? Or do I need to delve into writing JavaScript with DB.command()? Does it require Map/Reduce? If so, a hint as to how to implement Map/Reduce for this problem would help a lot.


Solution

  • For now, I strongly recommend an external full-text engine such as Solr or Elasticsearch. MongoDB's full-text search capabilities are truly not suitable for a real full-text solution, and your approach of pre-stemming the terms yourself is just a dirty workaround. Until MongoDB provides suitable full-text integration, go with an external engine if you want a serious, working solution.