Search code examples
elasticsearchelasticsearch-dsl

Filter Elasticsearch by day of week and time


I have an index for businesses in Elasticsearch. Each document in the index represent a business, and each business has business_hours. I'm trying to allow a filter on the business hours using the day of the week and the time. For example,we'd like to be able to do a filter for show we all business that are open Tuesday Evenings after 6:00PM I'm thinking that we should have a field with the following mappings:

  {
      "mappings": {
        "properties": {
          
          "business_hours": {
                             "type": "date_range",
                            "format": "w'T'hh:mma"
          }
        }
      }
    }

Each document would then have an array of business_hours. So a store that is open on Monday 9:00AM - 5:00PM, and Tuesday 9:30AM - 5:00PM would look like this:

POST my-index/_doc
    {
      "name": "My Store",
      "business_hours": [
        {
        "gte": "1T09:00AM",
        "lte": "1T05:00PM"
        },
        {
        "gte": "2T09:30AM",
        "lte": "2T05:00PM"
        }
      ]
    }

I tried to search this document and query for it, but the filter for the hours is not working, they look like theyre getting ignored.... Does Elasticsearch support filtering by a day of the week or does it need to be an actual datetime?

Here's the query I used. Its supposed to filter for business open on Wednesday but it returned the document above which only had business hours on Monday and Tuesday

GET my-index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "business_hours": {
              "gte": "3T10:00AM",
              "lte": "3T05:00PM",
              "relation": "CONTAINS"
            }
          }
        }
      ]
    }
  }
}

Solution

  • The idea of using a range field is a good one. However, instead of a date_range, which works on absolute dates, I suggest using an integer_range field.

    Since every day contains 1440 minutes, my suggestion would be to encode the opening hours into the number of minutes since midnight and prefix that number with the index of the day (Monday = 1, Tuesday = 2, etc). The formula to transform a given hour into minutes since midnight is pretty easy:

    (60 * HH) + MM 
    
    Note: HH is in 24 hours format, not AM/PM, but that's a detail
    

    Taking your example above, it would yield this:

    POST my-index/_doc
    {
      "business_hours": [
        {
          "gte": 10540,       <--- Monday (1), 540 minutes after midnight
          "lte": 11020        <--- Monday (1), 1020 minutes after midnight
        },
        {
          "gte": 20570,       <--- Tuesday (2), 570 minutes after midnight
          "lte": 21020        <--- Tuesday (2), 1020 minutes after midnight
        }
      ]
    }
    

    That way the range query becomes an easy one which gets rid of any date-related issues. For instance, the query below retrieves the document above by searching for businesses that are open on Mondays between 6AM and 5PM

    GET my-index/_search
    {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "business_hours": {
                  "gte": "10600",         <--- Monday (1), 600 minutes after midnight
                  "lte": "11020",         <--- Monday (1), 1020 minutes after midnight
                  "relation": "CONTAINS"
                }
              }
            }
          ]
        }
      }
    }