I have this mapping:
{
"foos": {
"mappings": {
"foo": {
"dynamic": "false",
"properties": {
"some_id": {
"type": "integer"
},
"language": {
"type": "text"
},
"locations": {
"type": "integer"
},
"name": {
"type": "text",
"term_vector": "yes",
"analyzer": "name_analyzer"
},
"popularity": {
"type": "integer"
},
"some_deep_count": {
"type": "object"
}
}
}
}
}
}
one example entry is the following:
{
"name": "Some nice name",
"some_id": 1,
"id": 4378,
"popularity": 525,
"some_deep_count": {
"0": {
"32026": 344,
"55625": 458,
"29": 1077,
"55531": 1081,
...
},
"1": {
"32026": 57,
"55625": 60,
"29": 88,
...
}
},
"locations": [
32026,
55625,
...
],
"language": [
"es",
"en"
]
}
where the some_deep_count
field can contain only the "0" and "1" keys, which inside of them can contain a very long list of id => value (dynamic, not configurable in advance)
This works very well when filtering:
"_source": [
"id",
"some_deep_count.*.55529"
],
but I don't understand how to sort by any deep object. I need a deep sum, like the following expression indicates:
...
{
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"source": "def deep0 = 0; def deep1 = 0; if(doc.containsKey('some_deep_count.0.55529')) { deep0 = doc['some_deep_count.0.55529'] } if(doc.containsKey('some_deep_count.1.55529')) { deep1 = doc['some_deep_count.1.55529'] } return deep0 + deep1"
},
"order": "desc"
}
}
}
which unfortunately always returns 0
in the sorting field, since doc.containsKey('some_deep_count.0.55529')
results always being empty. Also doc.containsKey('some_deep_count')
does.
Interestingly enough, doc.containsKey('some_id')
would work, and I really can't understand why
EDIT
In response to Val's proposal, I'm attaching a full request / response
Request:
{
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"source": "def ps0 = 0; if(doc.containsKey('some_deep_count.0.55529')) { ps0 = doc['some_deep_count.0.55529'].value; } return ps0 "
},
"order": "desc"
}
},
"_source": [
"id",
"some_deep_count.0.55529"
],
"size": 1
}
Response:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2121,
"max_score": null,
"hits": [
{
"_index": "foos",
"_type": "foo",
"_id": "5890",
"_score": null,
"_source": {
"some_deep_count": {
"0": {
"55529": 228
}
},
"id": 5890
},
"sort": [
0.0
]
}
]
}
}
The issue is probably to be found in the condition. In fact I get "sort":[0.0]
even when the sorting is as simple as "def ps0 = 0; if(doc.containsKey('some_deep_count')) { ps0 = 99999; } return ps0 "
, revealing that the clause doc.containsKey('some_deep_count')
might have some problem
EDIT2
The index got with curl -XGET localhost:9200/foos
looks like the following:
{
"foos": {
"aliases": {},
"mappings": {
"foo": {
"dynamic": "false",
"properties": {
"some_id": {
"type": "integer"
},
"language": {
"type": "text"
},
"locations": {
"type": "integer"
},
"name": {
"type": "text",
"term_vector": "yes",
"analyzer": "name_analyzer"
},
"popularity": {
"type": "integer"
},
"some_deep_count": {
"type": "object"
}
}
}
},
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "foos",
"creation_date": "1576168104248",
"analysis": {
"analyzer": {
"name_analyzer": {
"filter": [
"lowercase"
],
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "0",
"uuid": "26xckWaOQuuxFrMvIdikvw",
"version": {
"created": "6020199"
}
}
}
}
}
Thank you
I've been able to return a non-zero sort value by reproducing your case as shown below:
# 1. create the index mapping
PUT sorts
{
"mappings": {
"properties": {
"some_deep_count": {
"type": "object"
}
}
}
}
# 2. index a sample document
PUT sorts/_doc/1
{
"some_deep_count": {
"0": {
"29": 1077,
"32026": 344,
"55531": 1081,
"55625": 458
},
"1": {
"29": 88,
"32026": 57,
"55625": 60
}
}
}
# 3. Sort the results
POST sorts/_search
{
"sort": {
"_script": {
"type": "number",
"script": {
"lang": "painless",
"source": """
def deep0 = 0;
def deep1 = 0;
if(doc.containsKey('some_deep_count.0.55531')) {
deep0 = doc['some_deep_count.0.55531'].value;
}
if(doc.containsKey('some_deep_count.1.55531')) {
deep1 = doc['some_deep_count.1.55531'].value;
}
return deep0 + deep1;
"""
},
"order": "desc"
}
}
}
Results => sort = 1081
"hits" : [
{
"_index" : "sorts",
"_type" : "_doc",
"_id" : "1",
"_score" : null,
"_source" : {
"some_deep_count" : {
"0" : {
"29" : 1077,
"32026" : 344,
"55531" : 1081,
"55625" : 458
},
"1" : {
"29" : 88,
"32026" : 57,
"55625" : 60
}
}
},
"sort" : [
1081.0
]
}
]
As you can see I'm using 55531
which exists in some_deep_count.0
but not in some_deep_count.1
and the result is 1081, which is correct.
Using *.29
which is available in both 0 and 1 would yield 1165
which is also correct (1077 + 88).
The only difference between my script and yours is that when assigning deep0
and deep1
you need to add .value
to the doc field reference.
UPDATE
The problem lies in the fact that you're specifying dynamic: false
in your mapping. Using that parameter means that if you index new fields that weren't present in your mapping at index creation time, your mapping won't be updated. So as it stands, all the sub-fields that you index in some_deep_count
are never indexed, which is why you're always getting 0. Remove dynamic: false
and everything will work as expected.