elasticsearch github-actions kibana webhooks

How to join indices in Elasticsearch and Kibana


I know I am thinking of this in SQL terms so I know there is going to be a shift in thinking.

Basically I am ingesting webhooks from GitHub about actions and I want to pull some data together for a visualization.

I want to join these two indices so that, for each entry in the workflow_job index, I can pull the path from the corresponding entry in the workflow_run index and show the workflow file associated with each job. Ultimately I am going to group the jobs together and list the top jobs that fail.

A sample of the indices is here:

[image: workflow_run index sample] [image: workflow_job index sample]

What is the correct way to visualize this data in Kibana?


Solution

  • I know I am thinking of this in SQL terms so I know there is going to be a shift in thinking.

    TL;DR: That is correct. If you want to use Kibana, you need to denormalize everything. If your data is not in a single index with homogeneous records, you are not going to be a happy camper.

    There is pretty much no support for many-to-many relationships here; neither Kibana nor Elasticsearch can handle them. Elasticsearch has limited support for one-to-many relationships. If you can model your data as one-to-many (for example, by creating a single workflow record), Elasticsearch can handle it with either a join field or nested records, and there are dedicated queries for both (such as has_child, has_parent, and nested). Note that with a join field all related records have to reside in the same index, and with nested fields they have to reside in the same record.
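To make the join-field option concrete, here is a minimal sketch of what the mapping, the documents, and a has_child query could look like, written as plain Python dicts that you would send to Elasticsearch as JSON. The join field name `relation`, the relation names `workflow_run` and `workflow_job`, and the fields `path`, `name`, and `conclusion` are assumptions for illustration; the real field names depend on how the webhooks were indexed.

```python
import json

# Hypothetical mapping: one index holds both record types, and the
# "relation" join field declares workflow_run as parent of workflow_job.
JOIN_MAPPING = {
    "mappings": {
        "properties": {
            "relation": {
                "type": "join",
                "relations": {"workflow_run": "workflow_job"},
            },
            "path": {"type": "keyword"},
            "name": {"type": "keyword"},
            "conclusion": {"type": "keyword"},
        }
    }
}


def run_doc(path):
    """Parent document: a workflow run with its workflow file path."""
    return {"path": path, "relation": {"name": "workflow_run"}}


def job_doc(run_id, name, conclusion):
    """Child document: a job that points at its parent run.

    When indexing, a child must be routed to the parent's shard,
    i.e. indexed with routing equal to the parent document id.
    """
    return {
        "name": name,
        "conclusion": conclusion,
        "relation": {"name": "workflow_job", "parent": str(run_id)},
    }


def runs_with_failed_jobs_query():
    """Search body: parent runs that have at least one failed child job."""
    return {
        "query": {
            "has_child": {
                "type": "workflow_job",
                "query": {"term": {"conclusion": "failure"}},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(JOIN_MAPPING, indent=2))
    print(json.dumps(runs_with_failed_jobs_query(), indent=2))
```

This only shows the Elasticsearch side of the relationship. The query returns parent runs, not a joined row per job, so it is still not an SQL-style join.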

    Unfortunately, neither nested fields nor [join fields](https://github.com/elastic/kibana/issues/3730) are properly supported by Kibana at the moment. Some features, such as querying nested fields, are available, but overall most of your queries and aggregations will either not be possible or will break unexpectedly.
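The denormalization recommended in the TL;DR could be done before indexing, roughly as sketched below. This assumes the events are stored in the shape of GitHub's webhook payloads, where a workflow_run event carries `workflow_run.id` and `workflow_run.path`, and a workflow_job event carries `workflow_job.run_id`, `workflow_job.name`, and `workflow_job.conclusion`. The function name `denormalize_jobs` and the output field names are hypothetical; adjust them to your actual mapping.

```python
def denormalize_jobs(run_events, job_events):
    """Copy each run's workflow file path onto its job records.

    Returns one flat record per job, suitable for a single index
    that Kibana can aggregate on directly.
    """
    # Look-up table: run id -> workflow file path.
    path_by_run_id = {
        event["workflow_run"]["id"]: event["workflow_run"]["path"]
        for event in run_events
    }

    flat_records = []
    for event in job_events:
        job = event["workflow_job"]
        flat_records.append(
            {
                "job_name": job["name"],
                "conclusion": job["conclusion"],
                "run_id": job["run_id"],
                # None if the matching run event has not been seen.
                "workflow_path": path_by_run_id.get(job["run_id"]),
            }
        )
    return flat_records


# Made-up sample events, trimmed to the fields used above.
sample_runs = [
    {"workflow_run": {"id": 1, "path": ".github/workflows/ci.yml"}},
]
sample_jobs = [
    {"workflow_job": {"run_id": 1, "name": "build", "conclusion": "failure"}},
    {"workflow_job": {"run_id": 2, "name": "lint", "conclusion": "success"}},
]
flat_jobs = denormalize_jobs(sample_runs, sample_jobs)
```

With every job record carrying its own `workflow_path`, listing the top failing jobs per workflow file becomes an ordinary filter on `conclusion` plus a terms aggregation in Kibana, with no join required.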