I'm fairly new to Morphia, MongoDB, and document-oriented databases in general. I'm looking for general guidance on how to approach the following problem.
We have a DB with around 500K Book documents:
{
"isbn" : "0-691-01305-5",
"title" : "For Whom the Bell Tolls",
"titleFTS" : [
"bell",
"toll" ],
"author" : "Hemingway, Ernest",
"ratingsCount" : 138,
"rating" : "3.5",
"sales" : 10245,
"price" : "12.95",
"category" : "fiction",
"description" : "The story of a young American in the International Brigades attached to a republican guerilla unit during the Spanish Civil War.",
"descriptionFTS" : [
"story",
"young",
"americ",
"internat",
"brigade",
"attach",
"republic",
"guerilla",
"unit",
"spanish",
"civil",
"war"]
}
We need to perform full-text searches over the title and description fields. To that end, I have created titleFTS and descriptionFTS arrays that contain the words from the title and description fields respectively, with stop words filtered out and the remaining words stemmed.
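For illustration, the preprocessing looks roughly like the sketch below. The class name FtsTerms, the tiny stop-word list, and the naiveStem placeholder are made up for this example; a real setup would plug in a proper stemming library and a much longer stop-word list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical helper that turns raw text into the terms stored in titleFTS / descriptionFTS.
class FtsTerms {
    // Tiny illustrative stop-word list (assumption); a real list would be much longer.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "a", "an", "and", "for", "in", "of", "the", "to", "whom"));

    // Lowercases the text, splits on non-letter characters, drops stop words,
    // applies the supplied stemmer, and keeps each term once in first-seen order.
    static List<String> toTerms(String text, UnaryOperator<String> stemmer) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue;
            }
            String stem = stemmer.apply(token);
            if (!terms.contains(stem)) {
                terms.add(stem);
            }
        }
        return terms;
    }

    // Placeholder stemmer: only strips a trailing "s". Not a real stemming algorithm.
    static String naiveStem(String word) {
        return word.length() > 3 && word.endsWith("s")
                ? word.substring(0, word.length() - 1)
                : word;
    }
}
```

The same function has to be applied to the keywords a user types, so that the search terms and the stored terms are normalized the same way.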
When searching, users enter keywords, and we return the Books that match all of the entered terms, e.g.:
db.Book.find({ titleFTS : { $all: ['spanish', 'civil', 'war']}})
db.Book.find({ descriptionFTS : { $all: ['spanish', 'civil', 'war']}})
This works fine, but now we come to the tough part: we'd like to order the results of the above queries based on multiple criteria. One such proposed ordering is the following:
1. matches in both the titleFTS and descriptionFTS fields
2. matches in the titleFTS field
3. matches in the descriptionFTS field
4. sales
5. rating
6. ratingsCount
Our app is written in Java and uses the Morphia API. I can envision how to write a Java Comparator for this sort of thing pretty easily, but obviously I'd like to do the ordering at the DB level.
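For reference, here is a minimal sketch of the client-side Comparator I have in mind. It assumes the ordering means: books matching all terms in both fields first, then title-only matches, then description-only matches, with ties broken by sales, rating, and ratingsCount, highest first. The BookRanking and Book classes are simplified stand-ins invented for this example (not our actual Morphia entity), and rating is assumed to be already parsed into a double.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory ranking; all names here are illustrative only.
class BookRanking {
    // Minimal stand-in for the Book document: only the fields used for ordering.
    static class Book {
        final String isbn;
        final List<String> titleFTS;
        final List<String> descriptionFTS;
        final int sales;
        final double rating;      // assumed parsed from the stored string
        final int ratingsCount;

        Book(String isbn, List<String> titleFTS, List<String> descriptionFTS,
             int sales, double rating, int ratingsCount) {
            this.isbn = isbn;
            this.titleFTS = titleFTS;
            this.descriptionFTS = descriptionFTS;
            this.sales = sales;
            this.rating = rating;
            this.ratingsCount = ratingsCount;
        }
    }

    // 0 = all terms in both fields, 1 = title only, 2 = description only, 3 = neither.
    static int matchTier(Book b, List<String> terms) {
        boolean inTitle = b.titleFTS.containsAll(terms);
        boolean inDescription = b.descriptionFTS.containsAll(terms);
        if (inTitle && inDescription) return 0;
        if (inTitle) return 1;
        if (inDescription) return 2;
        return 3;
    }

    // Lower tier first; within a tier, higher sales, rating, and ratingsCount first.
    static Comparator<Book> ordering(List<String> terms) {
        return Comparator
                .comparingInt((Book b) -> matchTier(b, terms))
                .thenComparing(Comparator.comparingInt((Book b) -> b.sales).reversed())
                .thenComparing(Comparator.comparingDouble((Book b) -> b.rating).reversed())
                .thenComparing(Comparator.comparingInt((Book b) -> b.ratingsCount).reversed());
    }
}
```

Sorting this way means pulling every matching Book into the application first, which is exactly what I would like to avoid by ordering in the database.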
Which finally brings me to the question: can this be done using the Morphia API? Or do I need to delve into writing JavaScript with DB.command()? Does it require Map/Reduce? If so, a hint as to how to implement Map/Reduce for this problem would help a lot.
For now, I strongly recommend an external full-text engine such as Solr or ElasticSearch. MongoDB's full-text search capabilities are truly not suitable for a real full-text solution, and your approach with pre-stemming etc. is just a dirty workaround. As long as MongoDB does not provide suitable full-text integration, go with an external engine if you want a serious, working solution.