
Spring Boot, query Elasticsearch specific fields from already indexed data created by Elastic Stack


The goal is to query specific fields from an index through a Spring Boot app.

The questions are at the end.

The data in Elasticsearch is created by the Elastic Stack (Beats, Logstash, etc.). It is somewhat inconsistent, e.g. some fields may be missing on some hits.

The Spring app does not add the data and has no control over the fields and indexes.

The query I need, using _source:

GET index-2022.07.27/_search
{
  "from": 0,
  "size": 100,
  "_source": ["@timestamp","message", "agent.id"],
  "query": {
      "match_all": {}
  }
}

returns hits like

  {
    "_index": "index-2022.07.27",
    "_id": "C1zzPoIBgxar5OgxR-cs",
    "_score": 1,
    "_ignored": [
      "event.original.keyword"
    ],
    "_source": {
      "agent": {
        "id": "ddece977-9fbb-4f63-896c-d3cf5708f846"
      },
      "@timestamp": "2022-07-27T09:18:27.465Z",
      "message": """a message"""
    }
  },

and with fields instead of _source, a hit looks like

{
    "_index": "index-2022.07.27",
    "_id": "C1zzPoIBgxar5OgxR-cs",
    "_score": 1,
    "_ignored": [
      "event.original.keyword"
    ],
    "fields": {
      "@timestamp": [
        "2022-07-27T09:18:27.465Z"
      ],
      "agent.id": [
        "ddece977-9fbb-4f63-896c-d3cf5708f846"
      ],
      "message": [
        """a message"""
      ]
    }
},
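The two shapes differ in how a value is reached: `_source` keeps the original nested objects, while `fields` returns flat dotted keys whose values are always arrays. A minimal sketch in plain Java, assuming a hit has already been parsed into `Map` structures (the `HitShapes` class and its method names are hypothetical and not part of any Elasticsearch client):

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of any Elasticsearch client library.
class HitShapes {

    // "_source" shape: agent.id is nested as {"agent": {"id": "..."}}.
    static Optional<String> agentIdFromSource(Map<String, Object> source) {
        Object agent = source.get("agent");
        if (agent instanceof Map) {
            Object id = ((Map<?, ?>) agent).get("id");
            return id instanceof String ? Optional.of((String) id) : Optional.empty();
        }
        return Optional.empty(); // agent is missing on this hit
    }

    // "fields" shape: the key is the flat name "agent.id" and the value is a list.
    static Optional<String> agentIdFromFields(Map<String, List<Object>> fields) {
        List<Object> values = fields.get("agent.id");
        if (values == null || values.isEmpty()) {
            return Optional.empty(); // field is missing on this hit
        }
        return Optional.of(String.valueOf(values.get(0)));
    }

    public static void main(String[] args) {
        Map<String, Object> source = Map.of(
                "agent", Map.of("id", "ddece977-9fbb-4f63-896c-d3cf5708f846"),
                "message", "a message");
        Map<String, List<Object>> fields = Map.of(
                "agent.id", List.of("ddece977-9fbb-4f63-896c-d3cf5708f846"));

        System.out.println(agentIdFromSource(source).orElse("(none)"));
        System.out.println(agentIdFromFields(fields).orElse("(none)"));
        // A hit without an agent object yields an empty Optional.
        System.out.println(agentIdFromSource(Map.of("message", "no agent")).orElse("(none)"));
    }
}
```

Either shape can carry the optional agent.id; the difference only matters for how the result is mapped onto a Java type.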
  1. How can I run this query with Spring Boot?

I am leaning towards StringQuery with the RestHighLevelClient, as below, but I can't get it to work:

        Query searchQuery = new StringQuery("{\"_source\":[\"@timestamp\",\"message\",\"agent.id\"],\"query\":{\"match_all\":{}}}");

        SearchHits<Items> productHits = elasticsearchOperations.search(
                searchQuery,
                Items.class,
                IndexCoordinates.of(CURRENT_INDEX));
  2. What form must Items.class have? Which fields?

I just need timestamp, message and agent.id. The latter is optional; it may not exist.

  3. How will the mapping work?

versions:

  • Elastic: 8.3.2
  • Spring boot: 2.6.6
  • elastic (mvn): 7.15.2
  • spring-data-elasticsearch (mvn): 4.3.3

The official documentation states that these versions should be compatible when the RestHighLevelClient is used:

Support for upcoming versions of Elasticsearch is being tracked and general compatibility should be given assuming the usage of the high-level REST client.


Solution

  • You can define an entity class for the data you want to read (note I have a nested class for the agent):

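    import java.time.Instant;

    import org.springframework.data.annotation.Id;
    import org.springframework.data.elasticsearch.annotations.DateFormat;
    import org.springframework.data.elasticsearch.annotations.Document;
    import org.springframework.data.elasticsearch.annotations.Field;
    import org.springframework.data.elasticsearch.annotations.FieldType;

    @Document(indexName = "index-so", createIndex = false)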
    public class SO {
        @Id
        private String id;
    
        @Field(name = "@timestamp", type = FieldType.Date, format = DateFormat.date_time)
        private Instant timestamp;
    
        @Field(type = FieldType.Object)
        private Agent agent;
    
        @Field(type = FieldType.Text)
        private String message;
    
        public String getId() {
            return id;
        }
    
        public void setId(String id) {
            this.id = id;
        }
    
        public Instant getTimestamp() {
            return timestamp;
        }
    
        public void setTimestamp(Instant timestamp) {
            this.timestamp = timestamp;
        }
    
        public Agent getAgent() {
            return agent;
        }
    
        public void setAgent(Agent agent) {
            this.agent = agent;
        }
    
        public String getMessage() {
            return message;
        }
    
        public void setMessage(String message) {
            this.message = message;
        }
    
        // static nested class: creating it needs no reference to an enclosing SO instance
        static class Agent {
            @Field(name = "id", type = FieldType.Keyword)
            private String id;
    
            public String getId() {
                return id;
            }
    
            public void setId(String id) {
                this.id = id;
            }
        }
    }
    

    The query then would be:

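    // Imports needed at the top of the file:
    // import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;
    // import org.springframework.data.elasticsearch.core.query.FetchSourceFilter;
    // import org.springframework.data.elasticsearch.core.query.NativeSearchQueryBuilder;
    var query = new NativeSearchQueryBuilder()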
        .withQuery(matchAllQuery())
        .withSourceFilter(new FetchSourceFilter(
            new String[]{"@timestamp", "message", "agent.id"}, 
            new String[]{}))
        .build();
    var searchHits = operations.search(query, SO.class);
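
Since agent is missing on some hits, the mapped agent property should, as far as I know, simply stay null for those documents, so reading it needs a null check. A minimal sketch of that access pattern in plain Java; `AgentStub`, `EntryStub` and `agentIdOf` are hypothetical stand-ins for the entity above, not Spring Data types (in the real code you would iterate the returned `SearchHits` and call `getContent()` on each hit):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Hypothetical stand-ins for the SO entity and its nested Agent; not Spring Data types.
class AgentStub {
    final String id;

    AgentStub(String id) {
        this.id = id;
    }
}

class EntryStub {
    final String message;
    final AgentStub agent; // null when the document has no agent object

    EntryStub(String message, AgentStub agent) {
        this.message = message;
        this.agent = agent;
    }
}

class OptionalAgent {

    // Returns the agent id, or an empty Optional when the agent or its id is absent.
    static Optional<String> agentIdOf(EntryStub entry) {
        return Optional.ofNullable(entry.agent).map(a -> a.id);
    }

    public static void main(String[] args) {
        List<EntryStub> entries = Arrays.asList(
                new EntryStub("a message", new AgentStub("ddece977-9fbb-4f63-896c-d3cf5708f846")),
                new EntryStub("no agent here", null));

        // Build one printable line per entry, falling back to "(none)" for a missing agent.
        List<String> lines = entries.stream()
                .map(e -> e.message + " | agent=" + agentIdOf(e).orElse("(none)"))
                .collect(Collectors.toList());
        lines.forEach(System.out::println);
    }
}
```

Wrapping the access in an Optional keeps the null handling in one place instead of repeating the check wherever the agent id is read.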