I am using Sphinx to provide search to a website and I've run across a bit of a snag when returning relevant results.
To keep my question simple, let's assume that I have two fields, @title and @body, which are weighted 100 and 15 respectively. When I search for a short word like 'in', I would like exact matches for that term to rank highest, and matches for 'in*|*in|*in*' to rank slightly lower. Is there any way to get this kind of specificity in search ranking?
Example results for 'in':
Some relevant settings are:
In sphinx.conf:
morphology = stem_en
charset_type = utf-8
min_word_len = 2
min_prefix_len = 0
min_infix_len = 2
enable_star = 1
In search.php:
$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
$sp->SetFieldWeights ( array('title' => 100, 'body' => 15) );
Also, as a side note: I've also had some instances where partial matches don't even show up in the search results. For example, I have searched for Cow but Cowboy does not show up as a result. I have also searched for Cowb and Cowbo and it wasn't until I typed Cowboy that I received the expected result. Any thoughts?
This question is along the same lines as this previous SO question, but I hope I've given a little more detail as to my problem and the things I've tried to warrant a solution.
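It looks like Cow and Cowboy are not related morphologically: stem_en does not reduce "cowboy" to "cow", so a search for Cow will not match Cowboy through stemming alone.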
You could solve it in two ways:
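Regarding different ranking for "in" and "*in*", I would suggest having two body fields in the index, say body and body_star, where body_star holds the same content as body. You can then give each field its own weight, and query the exact term against body and the wildcard term against body_star.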
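A rough sketch of how the duplicated field might be set up in sphinx.conf. The source name src1, index name idx1, and table name documents are placeholders I made up, and limiting infix indexing to body_star via infix_fields is my own assumption about how to keep the index smaller, not something your setup requires:

```conf
source src1
{
    # ... existing connection settings unchanged ...
    # Select the body column twice so it is indexed as two separate fields.
    # "documents" is a hypothetical table name.
    sql_query = SELECT id, title, body, body AS body_star FROM documents
}

index idx1
{
    source        = src1
    # ... existing settings (path, morphology, charset_type, ...) unchanged ...
    min_infix_len = 2
    enable_star   = 1
    # Optional: build infixes only for the duplicated field.
    infix_fields  = body_star
}
```

With infix_fields set this way, wildcard terms such as *in* can only match in body_star, while title and body match whole (stemmed) words.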
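In search.php:
$sp->SetMatchMode( SPH_MATCH_EXTENDED2 );
$sp->SetRankingMode( SPH_RANK_PROXIMITY_BM25 );
// Exact matches in title and body outweigh wildcard matches in body_star.
$sp->SetFieldWeights ( array('title' => 20, 'body' => 15, 'body_star' => 5) );
// OR the clauses together so a document matching any one of them is returned.
$sp->Query("(@title in) | (@body in) | (@body_star *in*)");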
This should do the trick.