
How to store nested fields in Elasticsearch


I am using the Java High Level REST Client for Elasticsearch.

I have created a client.

trait HighLevelRestClient {

  def elasticSearchClient(): RestHighLevelClient = {
   new RestHighLevelClient(
      RestClient.builder(
        new HttpHost("localhost", ElasticSearchPort, "http")))
  }
}

While indexing the data, the nested fields are being stored as a String. The following code shows how the document is being indexed:

val indexRequest = new IndexRequest("my-index", "test-type").source(
  "person", person,
  "date", DateTime.now()
)

where person is an instance of a case class, defined as:

Person(personId: String, name: String, address: Address) 

and Address is itself a case class, represented as:

Address(city: String, zip: Int)

My application requires person to be stored as key-value pairs, so that its fields are searchable. But when I use the above code, it is stored as a String:

{
  "person" : "Person(my-id, my-name, Address(my-city, zip-value))",
  "date" : "2017-12-12"
}

and the required structure is:

{
  "person" : {
    "personId" : "my-id",
    "name" : "person-name",
    "address" : {
      "city" : "city-name",
      "zip" : 12345
    }
  },
  "date" : "2017-12-12"
}

I hope I have framed the question well. Any help would be appreciated. Thanks!


Solution

  • You are almost there. To achieve your goal you need to:

    1. Serialize the object to JSON on your side
    2. Specify the content type of the request

    This is described on the documentation page of the Index API.

    A convenient library for serializing case classes to JSON is, for example, json4s.

    Your code might look like the following:

    import org.apache.http.HttpHost
    import org.elasticsearch.action.index.IndexRequest
    import org.elasticsearch.client.{RestClient, RestHighLevelClient}
    import org.elasticsearch.common.xcontent.XContentType
    import org.joda.time.DateTime
    import org.json4s.NoTypeHints
    import org.json4s.jackson.Serialization
    import org.json4s.jackson.Serialization.write
    
    case class Address(city: String, zip: Int)
    
    case class Person(personId: String, name: String, address: Address)
    
    case class Doc(person: Person, date: String)
    
    object HighClient {
      def main(args: Array[String]): Unit = {
        val client = new RestHighLevelClient(
          RestClient.builder(
            new HttpHost("localhost", 9206, "http")))
    
        // json4s needs implicit Formats in scope; NoTypeHints keeps type metadata out of the JSON
        implicit val formats = Serialization.formats(NoTypeHints)
    
        val doc = Doc(
          Person("blah1", "Peter Parker", Address("New-York", 33755)),
          DateTime.now().toString
        )
    
        // write(doc) produces a JSON string; XContentType.JSON tells the client how to interpret it
        val indexRequest = new IndexRequest("my-index", "test-type").source(
          write(doc), XContentType.JSON
        )
    
        client.index(indexRequest)
    
        client.close()
      }
    }
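
If adding a JSON library is not an option, another route is to build the nested structure by hand as `java.util.Map` values. The sketch below uses only the standard library; the `NestedSource` object and its helper methods are hypothetical names introduced for illustration, and the case classes mirror the ones from the question.

```scala
import java.util.{HashMap => JHashMap, Map => JMap}

case class Address(city: String, zip: Int)
case class Person(personId: String, name: String, address: Address)

// Hypothetical helpers (not part of any library): each one copies the
// fields of a case class into a Java map, nesting maps for nested objects.
object NestedSource {
  def addressMap(a: Address): JMap[String, AnyRef] = {
    val m = new JHashMap[String, AnyRef]()
    m.put("city", a.city)
    m.put("zip", Int.box(a.zip)) // box the Int so it fits a map of AnyRef
    m
  }

  def personMap(p: Person): JMap[String, AnyRef] = {
    val m = new JHashMap[String, AnyRef]()
    m.put("personId", p.personId)
    m.put("name", p.name)
    m.put("address", addressMap(p.address)) // nested object becomes a nested map
    m
  }

  def docMap(p: Person, date: String): JMap[String, AnyRef] = {
    val m = new JHashMap[String, AnyRef]()
    m.put("person", personMap(p))
    m.put("date", date)
    m
  }
}
```

The resulting map could then be passed to the overload of `source` that accepts a `java.util.Map`, for example `new IndexRequest("my-index", "test-type").source(NestedSource.docMap(person, DateTime.now().toString))`. This has not been checked against every client version, so treat it as a starting point rather than a drop-in replacement for the json4s approach.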
    

    Note that in this case:

    new IndexRequest("my-index", "test-type").source(
      write(doc), XContentType.JSON
    )
    

    this overload will be used: public IndexRequest source(String source, XContentType xContentType), which takes the document as a ready-made JSON string.

    While in your case:

    new IndexRequest("my-index", "test-type").source(
      "person", person,
      "date", DateTime.now()
    )
    

    it will call the varargs overload public IndexRequest source(Object... source), which expects alternating field names and values.
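
Judging by the document you got back, the client did not know how to serialize the `Person` value and stored its `toString` output instead. A case class has no JSON form of its own, only the constructor-style rendering shown in this minimal sketch (plain Scala, no Elasticsearch involved; `ToStringDemo` is a name made up for illustration):

```scala
case class Address(city: String, zip: Int)
case class Person(personId: String, name: String, address: Address)

object ToStringDemo {
  val person: Person = Person("my-id", "my-name", Address("my-city", 12345))

  // The default toString of a case class is its name followed by the
  // comma-separated field values, applied recursively to nested case classes.
  val rendered: String = person.toString
  // rendered == "Person(my-id,my-name,Address(my-city,12345))"
}
```

Field names are not part of that string, which is why nothing inside `person` was searchable as a separate field.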

    Hope that helps!