Solr highlighting with unexpected prefix and suffix


I need to customize Solr highlighting prefix and suffix like this:

<span class="highlight">text</span>

instead of the default

<em>text</em>

That's why I'm using this configuration within the solrconfig.xml for the HighlightComponent:

<searchComponent class="solr.HighlightComponent" name="highlight">
    <highlighting>
        <fragmentsBuilder name="simple" default="true" class="solr.highlight.SimpleFragmentsBuilder">
            <lst name="defaults">
                <str name="hl.tag.pre"><![CDATA[<span class="highlight">]]></str>
                <str name="hl.tag.post"><![CDATA[</span>]]></str>
            </lst>
        </fragmentsBuilder>
    </highlighting>
</searchComponent>

The following are the default parameters for my standard request handler:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
        <str name="hl">true</str>
        <str name="hl.fl">body,title</str>
        <str name="hl.useFastVectorHighlighter">true</str>
    </lst>
</requestHandler>

When I search for the word text, it is highlighted, but not always with the prefix and suffix I configured:

<lst name="highlighting">
    <lst name="document_1">
        <arr name="body">
            <str>my <em>text</em> highlighted</str>
        </arr>
        <arr name="title">
            <str>my <span class="highlight">text</span> highlighted</str>
        </arr>
    </lst>
</lst>

Does anybody know why?


Solution

  • I am guessing you are seeing this behavior because the prefix and suffix are only defined for the SimpleFragmentsBuilder, and the other highlights are coming from another fragment builder.

    I use a custom prefix and suffix for my own highlighting. I set them in the formatter section of the highlighting configuration in solrconfig.xml and have not had any issues, as they seem to apply regardless of the fragment builder.

    So maybe try the following:

    <highlighting>
        <fragmentsBuilder name="simple" default="true"
                          class="solr.highlight.SimpleFragmentsBuilder"/>
        <!-- Configure the standard formatter -->
        <formatter name="html" default="true"
                   class="org.apache.solr.highlight.HtmlFormatter">
            <lst name="defaults">
                <str name="hl.simple.pre"><![CDATA[<span class="highlight">]]></str>
                <str name="hl.simple.post"><![CDATA[</span>]]></str>
            </lst>
        </formatter>
    </highlighting>
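
One possible explanation for the mixed output, offered as an assumption rather than something confirmed in the question: the FastVectorHighlighter is only used for fields indexed with term vectors, positions and offsets, so a field without them (here, perhaps body) would fall back to the standard highlighter, which reads hl.simple.pre and hl.simple.post instead of hl.tag.pre and hl.tag.post. If that is the case, a minimal sketch that sets both pairs in the request handler defaults from the question would keep the markup the same whichever highlighter handles a field:

```xml
<requestHandler name="standard" class="solr.SearchHandler" default="true">
    <lst name="defaults">
        <str name="hl">true</str>
        <str name="hl.fl">body,title</str>
        <str name="hl.useFastVectorHighlighter">true</str>
        <!-- Read by the standard highlighter's formatter -->
        <str name="hl.simple.pre"><![CDATA[<span class="highlight">]]></str>
        <str name="hl.simple.post"><![CDATA[</span>]]></str>
        <!-- Read by the FastVectorHighlighter's fragments builder -->
        <str name="hl.tag.pre"><![CDATA[<span class="highlight">]]></str>
        <str name="hl.tag.post"><![CDATA[</span>]]></str>
    </lst>
</requestHandler>
```

Because these are ordinary request parameters, the same four values could also be passed on the query string to test the idea before changing solrconfig.xml.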