Search code examples
sortingelasticsearchelasticsearch-dsl

Elasticsearch conditional sort on date field


I am trying to sort an Elastic Search query result on a date field, registeredAt. However, registeredAt doesn't exist in all documents returned. In that case, I would want the sort to look for the date on an alternative field, invitedAt.

If we have 3 hits which look like this:


    hits = [
    {
       id: 'hit2'
       registeredAt: '2021-06-01T23:00:00.000Z',
       invitedAt: '2021-05-31T23:00:00.000Z'
    },
    {
       id: 'hit3'
       invitedAt: '2021-05-31T23:00:00.000Z'
    },
    {
       id: 'hit1'
       invitedAt: '2021-06-04T23:00:00.000Z'
    },
    
    ],

then I would want the sort to return them in order from most recent to least recent: [hit1, hit2, hit3]. In each document, the sort script should look for the registeredAt field and take that date as the sort value and, if that field does not exist, look at the value for invitedAt and take that as the sort value. In that sense, hit1 does not have a registeredAt and has the most recent date for invitedAt and, as such, should come first. hit2 has a registeredAt field and the date for that field is more recent than the invitedAt date of hit3 (which doesn't have a registeredAt field.

I have written the query as such:

client.search({
        index: 'users',
        track_total_hits: true,
        sort: {
        _script: {
          type: 'number',
          script: {
            lang: 'painless',
            source:
              "if (!doc.containsKey('registeredAt') || doc['registeredAt'].empty) { return doc['invitedAt'].value; } else { return doc['registeredAt'].value }",
          },
          order: 'desc',
        },
      },
        body: {
          from: skip,
          size: limit,
          query: {...},
        },
      })
      

The query runs without errors but the sorting does not work and the documents are returned in the order that they were indexed in.


Solution

  • I assume that registeredAt and invitedAt are date in the mapping. This query should work. What I added is calling .getMillis() after getting the value.

    {
      "sort": [
        {
          "_script": {
            "type": "number",
            "script": {
              "lang": "painless",
              "source": """
                if (!doc.containsKey('registeredAt') || doc['registeredAt'].empty) { 
                  return doc['invitedAt'].value.getMillis();
                  
                } 
                else { 
                  return doc['registeredAt'].value.getMillis();
                }
              """
            },
            "order": "desc"
          }
        }
      ]
    }
    

    Edit: .getMillis() is depricated in version 7.x. .toInstant().toEpochMilli() should be used instead.

    This is the query:

    {
      "sort": [
        {
          "_script": {
            "type": "number",
            "script": {
              "lang": "painless",
              "source": """
                if (!doc.containsKey('registeredAt') || doc['registeredAt'].empty) { 
                  return doc['invitedAt'].value.toInstant().toEpochMilli();
                  
                } 
                else { 
                  return doc['registeredAt'].value.toInstant().toEpochMilli();
                }
              """
            },
            "order": "desc"
          }
        }
      ]
    }