solr, solrcloud

How to intercept a document in Solr before indexing


I want to manipulate a document and change the token values of one or more fields by prepending a value to each token. I do bulk updates through DIH and also post documents through SolrJ. My replication factor is 2, so replication should keep working as well. The value I want to prepend is already in the document as a separate field. Where can I intercept the document before indexing so that I can manipulate it? One option I can think of is overriding DirectUpdateHandler2. Is that the right place?

I could do this by processing the document externally and then passing it to Solr, but I want to do it inside Solr.

The document fields are:

  1. city:mumbai
  2. RestaurantName:Talk About
  3. Keywords:Cofee, Chines, South Indian, Bar

I want to index the keywords as:

  1. mumbai_cofee
  2. mumbai_Chines
  3. mumbai_South Indian
  4. mumbai_Bar

Solution

  • The right place is an Update Request Processor (URP). Plug it into solrconfig.xml for every update handler you use (including DIH), and that single URP will cover all updates.

    In the URP's Java code you can read the value of one field and prepend it to the values of another field. This happens before the document is indexed.
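    The prepend step described above can be sketched in plain Java. This is only an illustration, not Solr code: KeywordPrefixer and prefixedValues are hypothetical names, and a plain Map stands in for SolrInputDocument so the snippet runs without Solr on the classpath. It also assumes Keywords arrives either as a multi-valued field or as one comma-separated string. In a real URP you would apply the same logic inside processAdd(AddUpdateCommand cmd) to the document returned by cmd.getSolrInputDocument(), write the result back with setField, and then call super.processAdd(cmd).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr. A Map stands in for SolrInputDocument.
class KeywordPrefixer {

    // Returns the values of targetField, each prefixed with "<value of prefixField>_".
    // Returns an empty list when either field is missing.
    static List<String> prefixedValues(Map<String, Object> doc,
                                       String prefixField,
                                       String targetField) {
        Object prefix = doc.get(prefixField);
        Object raw = doc.get(targetField);
        List<String> out = new ArrayList<>();
        if (prefix == null || raw == null) {
            return out;
        }
        // Assumption: the field is either multi-valued (a Collection)
        // or a single comma-separated string.
        Collection<?> values = (raw instanceof Collection)
                ? (Collection<?>) raw
                : Arrays.asList(raw.toString().split(","));
        for (Object v : values) {
            String token = v.toString().trim();
            if (!token.isEmpty()) {
                out.add(prefix + "_" + token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample document from the question.
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("city", "mumbai");
        doc.put("RestaurantName", "Talk About");
        doc.put("Keywords", "Cofee, Chines, South Indian, Bar");

        // Overwrite Keywords with the prefixed values, as a URP would before indexing.
        doc.put("Keywords", prefixedValues(doc, "city", "Keywords"));
        System.out.println(doc.get("Keywords"));
        // prints [mumbai_Cofee, mumbai_Chines, mumbai_South Indian, mumbai_Bar]
    }
}
```

    The sketch keeps the original casing of each keyword. If the indexed tokens should be lowercased, that is probably better left to the field's analyzer than done in the processor.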