How do I log all queries in embedded Elasticsearch?


I'm trying to debug an Elasticsearch query. I've enabled explain for the problematic query, and the output shows the query multiplying the intermediate scores where it should be summing them. (I'm creating the query request using elastic4s.)

The problem is I cannot see what the generated query actually is. I want to determine whether the bug is in elastic4s (generating the query request incorrectly), in my code, or in elasticsearch. So I've enabled logging for the embedded elasticsearch instance used in the tests using the following code:

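import org.elasticsearch.common.logging.ESLoggerFactory
import org.elasticsearch.common.logging.slf4j.Slf4jESLoggerFactory
import org.elasticsearch.common.settings.Settings

// Route Elasticsearch's internal logging through SLF4J (and so through logback)
ESLoggerFactory.setDefaultFactory(new Slf4jESLoggerFactory())
val settings = Settings.settingsBuilder
  .put("path.data", dataDirPath)
  .put("path.home", "/var/elastic/")
  .put("cluster.name", clusterName)
  .put("http.enabled", httpEnabled)
  .put("index.number_of_shards", 1)
  .put("index.number_of_replicas", 0)
  .put("discovery.zen.ping.multicast.enabled", false)
  .put("index.refresh_interval", "10ms")
  .put("script.engine.groovy.inline.search", true)
  .put("script.engine.groovy.inline.update", true)
  .put("script.engine.groovy.inline.mapping", true)
  // Slow-log thresholds of 0s, intended to make every query and fetch phase get logged at DEBUG
  .put("index.search.slowlog.threshold.query.debug", "0s")
  .put("index.search.slowlog.threshold.fetch.debug", "0s")
  .build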

but I can't find any queries being logged in the log file configured in my logback.xml. Other log messages from elasticsearch are appearing there, just not the actual queries.


Solution

  • In the specific case of elastic4s, you can call .show on the elastic4s query object. For most types of request, this generates what the JSON body of the request would have been if the JSON-over-HTTP protocol had been used to send it. You can then log it at a convenient point in your code, e.g. if you have one method that generates all ES search queries. The code in Elasticsearch that generates the fake JSON could still have bugs of course, so it should not be trusted entirely. However, it's worth trying to reproduce the issue by running the output of .show in Sense against a real Elasticsearch cluster over HTTP. If you can, you (a) know that it's not an elastic4s bug, and (b) can easily manipulate the JSON to try to figure out what's causing the problem.

    .show calls toString in some cases, so if you are using the plain Elasticsearch API, or another JVM-based wrapper on top of it, you can call toString yourself to get the JSON string to log.

    With embedded Elasticsearch, this is as good as you're going to get in terms of logging, short of putting a breakpoint on the builder invocations and observing the actual Java Elasticsearch request objects that are created (which is the most accurate approach).
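
To illustrate the "one method that generates all ES search queries" idea, here is a minimal sketch of a logging pass-through. The names QueryLogging, logQuery and FakeSearch are hypothetical and not part of elastic4s or Elasticsearch. FakeSearch only stands in for a real request so the sketch runs without either library on the classpath. With elastic4s you would presumably pass something like `_.show` as the render function, and with the plain Java API `_.toString`.

```scala
// Hypothetical helper: not part of elastic4s or Elasticsearch.
object QueryLogging {
  /** Renders `request` with `render`, hands the result to `log`,
    * and returns the request unchanged so it can still be executed. */
  def logQuery[R](request: R)(render: R => String)(log: String => Unit): R = {
    log(render(request))
    request
  }
}

// Stand-in for a real request type (assumption, for illustration only).
// Its toString mimics a builder that renders itself as JSON.
final case class FakeSearch(text: String) {
  override def toString: String =
    "{\"query\":{\"match\":{\"_all\":\"" + text + "\"}}}"
}

object QueryLoggingDemo {
  def main(args: Array[String]): Unit = {
    // Swap `line => println(line)` for your SLF4J logger, e.g. logger.debug(line).
    val request =
      QueryLogging.logQuery(FakeSearch("scala"))(_.toString)(line => println(line))
    // `request` is the same object that was passed in, ready to be sent.
    assert(request == FakeSearch("scala"))
  }
}
```

Because the helper returns the request untouched, it can wrap the existing call site without changing what is sent to the cluster. The logged JSON is only as accurate as the render function, as noted above.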