Tags: python, elasticsearch, elasticsearch-dsl, elasticsearch-dsl-py

Elasticsearch boolean facets returned as wrong type


I'm using Elasticsearch v5.1.2 with elasticsearch-dsl, and facets on boolean fields are returned with the wrong type. Here's a minimal setup that reproduces the problem:

from elasticsearch_dsl import DocType, FacetedSearch, TermsFacet
from elasticsearch_dsl.field import Keyword, Integer, Boolean

class Post(DocType):
    comment = Keyword()
    likes = Integer()
    published = Boolean()
    class Meta:
        index = 'blog'

class PostSearch(FacetedSearch):
    index = 'blog'
    doc_types = [Post]
    fields = 'comment', 'likes', 'published'
    facets = {k: TermsFacet(field=k) for k in fields}

Now create some documents in the index, and execute a faceted search:

>>> Post.init()
>>> Post(comment='potato', likes=42, published=True).save()
True
>>> Post(comment='spud', likes=12, published=False).save()
True
>>> Post(comment='foo', likes=7, published=True).save()
True
>>> search = PostSearch()
>>> response = search.execute()

The individual response data looks correct:

>>> response.hits.total
3
>>> vars(response[0])
{'_d_': {u'comment': u'spud', u'likes': 12, u'published': False},
 'meta': {u'index': u'blog', u'score': 1.0, u'id': u'AVofDCdDpUlHAgmQ...}}
>>> response[0].published
False

That is, we have deserialized Python booleans on the search results. However, the data in the aggregations is incorrect:

>>> response.facets.to_dict()
{'comment': [(u'foo', 1, False), (u'potato', 1, False), (u'spud', 1, False)],
 'likes': [(7, 1, False), (12, 1, False), (42, 1, False)],
 'published': [(1, 2, False), (0, 1, False)]}

The facets should be 3-tuples of (value, count, selected). However, the boolean values come back as 1 and 0 rather than being deserialized, so the frontend and my templates cannot distinguish an integer from a boolean. To summarise, the actual and expected behaviour are shown below:

Actual behaviour:

>>> response.facets['published']
[(1, 2, False), (0, 1, False)]

Expected behaviour:

>>> response.facets['published']
[(True, 2, False), (False, 1, False)]

What am I doing wrong here? How can we make the facet values for a Boolean field deserialize correctly in the facets, as they do in the actual search results?


Solution

  • This is a bug in elasticsearch-dsl-py rather than a mistake in your code. It was reported and fixed in https://github.com/elastic/elasticsearch-dsl-py/issues/583
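
If upgrading to a release that includes the fix is not possible, the facet values can be converted on the client side. The following is only a minimal sketch, assuming the facets have the (value, count, selected) shape shown in the question and that boolean keys arrive as the integers 1 and 0. The helper name coerce_boolean_facets and its boolean_fields argument are hypothetical and are not part of elasticsearch-dsl.

```python
def coerce_boolean_facets(facets, boolean_fields):
    """Return a copy of `facets` with 1/0 values turned into True/False.

    `facets` is a plain dict such as the result of
    response.facets.to_dict(); `boolean_fields` names the facets
    backed by a Boolean field. Both names are illustrative only.
    """
    # Only the integers 1 and 0 are mapped; anything else is left untouched.
    to_bool = {1: True, 0: False}
    fixed = {}
    for name, buckets in facets.items():
        if name in boolean_fields:
            fixed[name] = [(to_bool.get(value, value), count, selected)
                           for value, count, selected in buckets]
        else:
            fixed[name] = list(buckets)
    return fixed


# Facet data copied from the question.
facets = {
    'comment': [('foo', 1, False), ('potato', 1, False), ('spud', 1, False)],
    'likes': [(7, 1, False), (12, 1, False), (42, 1, False)],
    'published': [(1, 2, False), (0, 1, False)],
}

fixed = coerce_boolean_facets(facets, boolean_fields={'published'})
print(fixed['published'])  # [(True, 2, False), (False, 1, False)]
```

The list of boolean fields is passed explicitly so that an Integer facet such as likes is never converted by accident, even if one of its values happens to be 1 or 0.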