solr, highlighting

Solr highlight fragment size set to unlimited, but still truncating a large document?


I've loaded approximately 15k documents of various sizes into Solr. The largest that I have measured is 59,455 characters of plain text. When I execute a query with highlighting and an unlimited fragment size, the highlighted text for this large document is truncated to 51,253 characters (this includes my pre and post tags).

Here is the URL for the query:

http://solr.nowhere.org:8080/solr/select?fl=*,score&sort=score%20desc&hl=true&hl.fragsize=-1&hl.fl=note&hl.simple.pre=<hit>&hl.simple.post=</hit>&hl.q=corn&q=corn

Why is Solr still truncating?

I'm using Solr 4.0.


Solution

  • You also need to raise hl.maxAnalyzedChars, because that parameter also limits the highlighting result. The Solr documentation describes it as:

    How many characters into a document to look for suitable snippets. This parameter makes sense for the original Highlighter only.

    The default value is "51200".

    You can assign a large value to this parameter and use hl.fragsize=0 to return highlighting in large fields that have size greater than 51200 characters.

    So, based on this, change to hl.fragsize=0 and set hl.maxAnalyzedChars to a value larger than your longest document (59,455 characters in your case).
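
As a rough sketch of the adjusted request, the snippet below rebuilds the query URL from the question with hl.fragsize=0 and an explicit hl.maxAnalyzedChars. The helper name build_highlight_url and the value 100000 are assumptions chosen for illustration (any value above the longest document should do); the host, field name, and tags are taken from the question.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, field, max_analyzed_chars):
    """Build a Solr select URL that highlights the whole field.

    Hypothetical helper for illustration; it only assembles the URL
    and does not contact Solr.
    """
    params = [
        ("q", query),
        ("fl", "*,score"),
        ("sort", "score desc"),
        ("hl", "true"),
        ("hl.q", query),
        ("hl.fl", field),
        # 0 = no fragmenting, use the whole field value
        ("hl.fragsize", 0),
        # must exceed the longest document, otherwise the highlighter
        # stops analyzing at the default of 51200 characters
        ("hl.maxAnalyzedChars", max_analyzed_chars),
        ("hl.simple.pre", "<hit>"),
        ("hl.simple.post", "</hit>"),
    ]
    # urlencode percent-encodes the tags and other reserved characters
    return base_url + "?" + urlencode(params)


# 100000 is an assumed value, comfortably above the 59,455-character document
url = build_highlight_url(
    "http://solr.nowhere.org:8080/solr/select", "corn", "note", 100000
)
print(url)
```

If the documents keep growing, the same value could instead be set as a default for the request handler in solrconfig.xml so that every query picks it up.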