I've loaded approximately 15k documents of various sizes into Solr. The largest that I have measured is 59,455 characters of plain text. When I execute a query with highlighting and an unlimited fragment size, the highlighted text for this large document is truncated to 51,253 characters (including my pre and post tags).
Here is the URL for the query:
http://solr.nowhere.org:8080/solr/select?fl=*,score&sort=score%20desc&hl=true&hl.fragsize=-1&hl.fl=note&hl.simple.pre=<hit>&hl.simple.post=</hit>&hl.q=corn&q=corn
Why is Solr still truncating?
I'm using Solr 4.0.
You also need to increase the value of hl.maxAnalyzedChars, because it also limits the highlighting result. The documentation describes the parameter as follows:
How many characters into a document to look for suitable snippets. This parameter makes sense for the original Highlighter only.
The default value is "51200".
You can assign a large value to this parameter and use hl.fragsize=0 to return highlighting in large fields that have size greater than 51200 characters.
So, based on this, change hl.fragsize to 0 and set hl.maxAnalyzedChars to a value larger than your longest document. The default of 51200 explains why your 59,455-character document is cut off at roughly that length.
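As a rough sketch of the adjusted request, the snippet below rebuilds the query URL from the question with hl.fragsize=0 and an explicit hl.maxAnalyzedChars. The helper name build_highlight_url and the value 100000 are assumptions for illustration only; any value larger than your longest document (59,455 characters here) should work.

```python
from urllib.parse import urlencode

# Host and path taken from the question; adjust for your own Solr instance.
SOLR_SELECT_URL = "http://solr.nowhere.org:8080/solr/select"


def build_highlight_url(query, field, max_analyzed_chars):
    """Build a select URL that highlights the whole field (hypothetical helper)."""
    params = [
        ("q", query),
        ("fl", "*,score"),
        ("sort", "score desc"),
        ("hl", "true"),
        ("hl.q", query),
        ("hl.fl", field),
        # 0 asks the highlighter to return the whole field as one fragment.
        ("hl.fragsize", "0"),
        # Must exceed the length of the longest document (assumed value).
        ("hl.maxAnalyzedChars", str(max_analyzed_chars)),
        ("hl.simple.pre", "<hit>"),
        ("hl.simple.post", "</hit>"),
    ]
    # urlencode percent-escapes the tags and other reserved characters.
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_highlight_url("corn", "note", 100000)
print(url)
```

Building the URL with urlencode also escapes the <hit> and </hit> tags, which the hand-written URL in the question leaves unescaped.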