I have a Solr 6.6.0 instance running and have indexed some documents (PDF and HTML). Previously I had Solr 4, and searching with highlighted results worked fine. Unfortunately, this (default) behaviour seems to have disappeared in version 6. The setup is the default one from the original Solr tutorial. I have played around with a lot of GET parameters but cannot get any highlighted content. I would appreciate any hint or tip to get this running. Am I missing some config changes or parameters?
E.g.
http://serv1:8983/solr/gettingstarted/select?wt=json&indent=true&q=betreten&hl=true&hl.method=unified
gives
{
  "responseHeader":{
    "zkConnected":true,
    "status":0,
    "QTime":152,
    "params":{
      "q":"betreten",
      "hl":"true",
      "indent":"true",
      "hl.method":"unified",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"maxScore":0.822483,"docs":[
      {
        "id":"/var/docs/2017/08/22/2319/page-1.html",
        "stream_size":[3820],
        "x_parsed_by":["org.apache.tika.parser.DefaultParser",
          "org.apache.tika.parser.html.HtmlParser"],
        "stream_content_type":["text/html"],
        "dc_title":["/var/docs/2017/08/22/2319/page-1.html (22.08.2017 23:19)"],
        "ocr_system":["tesseract 3.04.01"],
        "content_encoding":["UTF-8"],
        "content_type_hint":["text/html; charset=utf-8"],
        "resourcename":["/var/docs/2017/08/22/2319/page-1.html"],
        "title":["/var/docs/2017/08/22/2319/page-1.html (22.08.2017 23:19)"],
        "content_type":["application/xhtml+xml; charset=UTF-8"],
        "ocr_capabilities":["ocr_page ocr_carea ocr_par ocr_line ocrx_word"],
        "_version_":1576604407523442688}]
  },
  "highlighting":{
    "/var/docs/2017/08/22/2319/page-1.html":{
      "_text_":[]}}}
Thank you!
Highlighters generally analyze the stored text on the fly in order to highlight it.
Please check in your schema whether _text_ is stored. If you are using a managed schema, _text_ may not be stored. Check the following _text_ definition in managed-schema or schema.xml:
<field name="_text_" type="text_general" multiValued="true" indexed="true" stored="false"/>
stored="false" indicates that the contents of _text_ are not stored. If you set stored="true", then _text_ will be stored and will be available for highlighting.
Note: After changing schema.xml or managed-schema files,