search solr lucene edismax

Solr - identical search result scores for multiple search terms?


How can I get different scores for the results of a multi-term search?

Certain results in Solr have the same score even when the query contains multiple terms, as the example below shows.

I have two documents indexed in Solr, each containing id, name (the first name) and last_name. They look like the following:

<doc>
    <str name="id">1</str>
    <str name="last_name">fisher</str>
    <str name="name">john</str>
</doc>

<doc>
    <str name="id">2</str>
    <str name="last_name">darby</str>
    <str name="name">john</str>
</doc>

When I query just "john" both results come up. That is perfect. However, when I query "john fisher" both results come up but with the same score. What I want is different scores based on the relevancy of the search terms.

Here is the result for the following query: http://localhost:8983/solr/select?q=john+fisher%0D%0A&rows=10&fl=*%2Cscore

<response>
    ...
    <result name="response" numFound="2" start="0" maxScore="0.85029894">
        <doc>
            <float name="score">0.85029894</float>
            <str name="id">1</str>
            <str name="last_name">fisher</str>
            <str name="name">john</str>
        </doc>

        <doc>
            <float name="score">0.85029894</float>
            <str name="id">2</str>
            <str name="last_name">darby</str>
            <str name="name">john</str>
        </doc>
    </result>
</response>

Any help would be greatly appreciated.


Solution

  • Your best bet is to understand and analyse how the different factors affect your document score. Lucene has a helpful Explanation feature, and Solr builds on it to show how each score is calculated. Use the debugQuery parameter in Solr to see how a score is derived:

    ?q=john&fl=score,*&rows=2&debugQuery=on
    

    Example response:

    <lst name="debug">
        <str name="rawquerystring">john</str>
        <str name="querystring">john</str>
        <str name="parsedquery">+DisjunctionMaxQuery((text:john))</str>
        <str name="parsedquery_toString">+(text:john)</str>
        <lst name="explain">
            <!-- Score calculation for Result#1 -->
            <str>
                2.1536596 = (MATCH) fieldWeight(text:john in 36722), product of:
                1.0 = tf(termFreq(text:john)=1)
                8.614638 = idf(docFreq=7591, maxDocs=15393998)
                0.25 = fieldNorm(field=text, doc=36722)
            </str>
            <!-- Score calculation for Result#2 -->
            <str>
                2.1536596 = (MATCH) fieldWeight(text:john in 36724), product of:
                1.0 = tf(termFreq(text:john)=1)
                8.614638 = idf(docFreq=7591, maxDocs=15393998)
                0.25 = fieldNorm(field=text, doc=36724)
            </str>
        </lst>
    </lst>
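
The explain entries above can be checked by hand: each score is the product of tf, idf and fieldNorm, and both documents have the same three factors, which is why their scores are identical. A minimal Python sketch of that arithmetic follows; the helper name `parse_explain` is hypothetical and the parsing only handles the flat "product of" layout shown in this example, not nested explanations.

```python
import re

# One explain entry copied from the debug output above.
EXPLAIN = """
2.1536596 = (MATCH) fieldWeight(text:john in 36722), product of:
1.0 = tf(termFreq(text:john)=1)
8.614638 = idf(docFreq=7591, maxDocs=15393998)
0.25 = fieldNorm(field=text, doc=36722)
"""

def parse_explain(text):
    """Return (reported_score, {factor_name: value}) for one flat explain entry.

    Hypothetical helper for illustration; assumes the first line holds the
    final score and every following line has the form "<value> = <name>(...)".
    """
    lines = [ln.strip() for ln in text.strip().splitlines()]
    reported = float(lines[0].split("=", 1)[0])
    factors = {}
    for ln in lines[1:]:
        value, label = ln.split("=", 1)
        name = re.match(r"\s*(\w+)\(", label).group(1)
        factors[name] = float(value)
    return reported, factors

reported, factors = parse_explain(EXPLAIN)
# Multiply the three factors listed under "product of".
recomputed = factors["tf"] * factors["idf"] * factors["fieldNorm"]
print("reported:", reported)
print("recomputed:", recomputed)
```

The recomputed value agrees with the reported one up to single-precision rounding. To make two documents score differently, at least one of these factors has to differ between them.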
    

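    Besides this, you can use explainOther to find out why a certain document did not match the query, or why it scored as it did. explainOther takes a query that identifies the documents to explain, for example by id:

    ?q=john&fl=score,*&rows=2&debugQuery=on&explainOther=id:2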
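
When several of these debug requests need to be built, it helps to let a library handle the URL encoding. The sketch below uses Python's standard urllib.parse for that. The base URL is the one from the question, the function name `debug_url` is hypothetical, and passing `id:2` to explainOther is an assumption that the documents can be looked up by their id field.

```python
from urllib.parse import urlencode

# Select handler from the question; adjust host and core for your setup.
SOLR_SELECT = "http://localhost:8983/solr/select"

def debug_url(query, explain_other=None, rows=2):
    """Build a select URL with score output and debugQuery enabled.

    Hypothetical helper for illustration. explain_other, if given, is a
    query identifying the documents whose score should also be explained.
    """
    params = {"q": query, "fl": "score,*", "rows": rows, "debugQuery": "on"}
    if explain_other is not None:
        params["explainOther"] = explain_other
    # urlencode escapes spaces, commas and colons in the parameter values.
    return SOLR_SELECT + "?" + urlencode(params)

url = debug_url("john fisher", explain_other="id:2")
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the explain section of the response then shows the per-factor breakdown for each document.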
    

    Do Read: