
Solr not returning highlighted results


I am using Nutch 1.15 and Solr 7.3, and I set up search highlighting as described in the docs: https://lucene.apache.org/solr/guide/7_3/highlighting.html

A normal query against the Nutch Solr core works and returns results: curl http://localhost:8983/solr/nutch/select?q=content:build&wt=json&rows=10000&start=0

With the highlight query I get the same results, plus a warning: hl.q=content:build: not found

The query with the highlight parameters looks like this: curl http://localhost:8983/solr/nutch/select?q=content:build&hl=on&hl.q=content:build&wt=json&rows=10000&start=0

Here is the complete output:

$ curl http://localhost:8983/solr/nutch/select?q=content:build&hl=on&hl.q=content:build&wt=json&rows=10000&start=0
-sh: 8: hl.q=content:build: not found
[3]   Done(127)                  hl.q=content:build
[2]   Done                       curl http://localhost:8983/solr/nutch/select?q=content:build
$ {
  "responseHeader":{
    "status":0,
    "QTime":1,
    "params":{
      "q":"content:build"}},
  "response":{"numFound":2,"start":0,"docs":[
      {
        "digest":"ff0d20368525b3a0f14933eddb0809db",
        "boost":1.0151907,
        "id":"https://dummy_url",
        "title":"dummy title",
        "content":"dummy content",
        "_version_":1691148343590256640},
      {
        "digest":"4fd333469ed5d83ad08eaa7ef0b779c4",
        "boost":1.0151907,
        "id":"https://dummy_url1",
        "title":"dummy title1",
        "content":"dummy content1",
        "_version_":1691148343603888128}]
  }}

Does anyone have an idea how to resolve this? I am not getting any errors in the Nutch or Solr logs.


Solution

  • You're not running the command you think you're running. An unquoted & tells the shell to run the preceding command in the background, so the line is split at every & and you are effectively running multiple commands:

    curl http://localhost:8983/solr/nutch/select?q=content:build
    
    hl=on
    hl.q=content:build 
    wt=json
    rows=10000
    start=0
    

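    This is not what you intend to do. Only the first piece reaches curl, which is why the params in the response contain nothing but q. The pieces hl=on, wt=json, rows=10000 and start=0 are silently treated as shell variable assignments, while hl.q=content:build is not a valid variable name, so the shell tries to run it as a command and reports "not found". You can either wrap your URL within quotes (") or escape the ampersands: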

    curl "http://localhost:8983/solr/nutch/select?q=content:build&hl=on&hl.q=content:build&wt=json&rows=10000&start=0"
    
    # or
    
    curl http://localhost:8983/solr/nutch/select?q=content:build\&hl=on\&hl.q=content:build\&wt=json\&rows=10000\&start=0
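
A third option, not part of the original answer, is to let curl build the query string itself so that no & ever appears on the command line. The sketch below assumes the same host, core and parameters as the question; the variable SOLR_URL and the helper name solr_highlight are made up for illustration. It relies on curl's -G flag, which sends --data-urlencode pairs as a GET query string rather than a POST body.

```shell
#!/bin/sh
# Sketch only: SOLR_URL and solr_highlight are illustrative names,
# not something defined by Solr or Nutch.
SOLR_URL="http://localhost:8983/solr/nutch/select"

solr_highlight() {
    # $1 is the query, e.g. content:build
    # Each parameter is passed as its own quoted argument, so the shell
    # never sees a bare '&' and curl joins and URL-encodes the pairs.
    curl -sS -G "$SOLR_URL" \
        --data-urlencode "q=$1" \
        --data-urlencode "hl=on" \
        --data-urlencode "hl.q=$1" \
        --data-urlencode "wt=json" \
        --data-urlencode "rows=10000" \
        --data-urlencode "start=0"
}
```

Calling solr_highlight 'content:build' should then send all six parameters in a single request. If that works, the params section of the response should list hl and hl.q alongside q, rather than q alone.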