
How can I exclude results from elasticsearch based on the contents of a field?


I'm using Elasticsearch on AWS to store logs from CloudFront. I have written a simple query that returns all entries from the past 24 hours, sorted from newest to oldest:

{
  "from": 0,
  "size": 1000,
  "query": {
    "bool": {
      "must": [
        { "match": { "site_name": "some-site" } }
      ],
      "filter": [
        { 
          "range": { 
            "timestamp": {
              "lt": "now",
              "gte": "now-1d"
            }
          }
        }
      ]
    }
  },
  "sort": [
    { "timestamp": { "order": "desc" } }
  ]
}

Now, there are certain sources (identified by their user agent) whose results I would like to exclude. So my question boils down to this:

How can I filter out entries from the results when a certain field contains a certain string? Or:

query.filter.where('cs_user_agent').does.not.contain('Some string')

(This is not real code, obviously.)

I have tried to make sense of the Elasticsearch documentation, but I couldn't find a good example of how to achieve this.

I hope this makes sense. Thanks in advance!


Solution

  • Okay, I figured it out. What I've done is use a Bool Query in combination with a wildcard:

    {
      "from": 0,
      "size": 1000,
      "query": {
        "bool": {
          "must": [
            { "match": { "site_name": "some-site" } }
          ],
          "filter": [
            { 
              "range": { 
                "timestamp": {
                  "lt": "now",
                  "gte": "now-1d"
                }
              }
            }
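          ],
          "must_not": [
            { "wildcard": { "cs_user_agent": "*some string*" } }
          ]
        }
      },
      "sort": [
        { "timestamp": { "order": "desc" } }
      ]
    }

    The wildcard clause matches any user agent that contains "some string" (note the * on both sides; with only a trailing * it would match just the values that start with "some string"), and the "must_not" clause then excludes those documents from the results.

    One caveat: wildcard is a term-level query, so the pattern is compared against the indexed terms. If cs_user_agent is an analyzed text field, a pattern containing a space will not match, and you would need to run the query against an unanalyzed keyword version of the field instead.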

    I hope this helps others who run into this problem.
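
If you build this request from code rather than pasting JSON, the same body can be assembled as a plain dictionary. The following is only a minimal sketch in Python: build_query and its parameters are hypothetical names chosen for illustration, not part of any Elasticsearch client, and the function only constructs the request body without sending it anywhere.

```python
import json


def build_query(site_name, excluded_agent_substring, size=1000):
    """Build the search body from the answer above as a plain dict.

    Hypothetical helper for illustration only. The substring is wrapped
    in '*' on both sides so the wildcard matches it anywhere in the
    field. Any '*' or '?' inside the substring is still treated as a
    wildcard character, because no escaping is done here.
    """
    return {
        "from": 0,
        "size": size,
        "query": {
            "bool": {
                # Documents must belong to the requested site.
                "must": [{"match": {"site_name": site_name}}],
                # Restrict to the past 24 hours.
                "filter": [
                    {"range": {"timestamp": {"lt": "now", "gte": "now-1d"}}}
                ],
                # Drop documents whose user agent contains the substring.
                "must_not": [
                    {
                        "wildcard": {
                            "cs_user_agent": f"*{excluded_agent_substring}*"
                        }
                    }
                ],
            }
        },
        # Newest entries first.
        "sort": [{"timestamp": {"order": "desc"}}],
    }


if __name__ == "__main__":
    # Print the JSON body so it can be inspected or pasted into a request.
    print(json.dumps(build_query("some-site", "some string"), indent=2))
```

To exclude several user agents, you could append one wildcard clause per substring to the "must_not" list; a document is dropped if it matches any of them.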