Search code examples
elasticsearch

How do to an Elasticsearch ESQL Distinct Query?


In SQL:

select distinct(session_id) from my_table where userid = '20';

What I have in ESQL:

from my_table | where userid == 20

What should I add in the the next pipe so that it returns the unique session ids from the table? This is the link to the ESQL documentation. Which processing command is to be utilised for this?


Solution

  • You can add STATS count(*) BY session_id for distinct(session_id).

    #Push the sample data
    POST /my_table/_bulk
    { "index": { "_index": "my_table", "_id": "1" } }
    { "session_id": 1, "userid": 20, "other_column": "data1" }
    { "index": { "_index": "my_table", "_id": "2" } }
    { "session_id": 2, "userid": 20, "other_column": "data2" }
    { "index": { "_index": "my_table", "_id": "3" } }
    { "session_id": 1, "userid": 20, "other_column": "data3" }
    { "index": { "_index": "my_table", "_id": "4" } }
    { "session_id": 3, "userid": 30, "other_column": "data4" }
    { "index": { "_index": "my_table", "_id": "5" } }
    { "session_id": 4, "userid": 20, "other_column": "data5" }
    { "index": { "_index": "my_table", "_id": "6" } }
    { "session_id": 2, "userid": 20, "other_column": "data6" }
    

    #with DSL query
    POST /my_table/_search
    {
      "size":0,
      "query":{"term":{"userid":20}},
      "aggs":{"unique_session_ids":{"terms":{"field":"session_id"}}}
    }
    

    #with ESQL query
    POST /_query?format=txt
    {
      "query": " FROM my_table | WHERE userid == 20 | STATS count(*) BY session_id "
    }
    

    enter image description here

    Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/esql-rest.html#esql-rest-params