elasticsearch, elasticsearch-painless

Sorting Elasticsearch results by a given array of weights


I'm storing a series of feeds in Elasticsearch. Each feed has the actor who posted it and the posting date. Elsewhere I store a weighted value for each actor, like this:

weights: [{'id': 'mark', 'weight': 1}, {'id': 'jane', 'weight': 3}]

I need to query the feeds grouped by date but ordered by those weights. I tried to write a sort script in Painless, but I'm stuck on defining the weights:

{
    "size": 0,
    "query": {
        "bool": {
            "should": [
                {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "actor.id": "mark"
                                }
                            },
                            {
                                "range": {
                                    "published": {"gte": "2017-09-30T15:37:21.530483"}
                                }
                            }
                        ]
                    }
                },
                {
                    "bool": {
                        "must": [
                            {
                                "term": {
                                    "actor.id": "jane"
                                }
                            },
                            {
                                "range": {
                                    "published": {"gte": "2017-09-30T15:37:21.530483"}
                                }
                            }
                        ]
                    }
                }
            ]
        }
    },
    "aggs": {
        "dates": {
            "terms": {
                "field": "published_date"
            },
            "aggs": {
                "top_verbs_hits": {
                    "top_hits": {
                        "sort": {
                            "_script": {
                                "type": "number",
                                "script": {
                                    "lang": "painless",
                                    "source": "def weights = [{'id': 'mark', 'weight': 1}, {'id': 'jane', 'weight': 3}]; def weight = 0; for (int i = 0; i < weights.length; ++i) { if (weights[i].id == doc.actor.id) return weights[i].weight; } return weight;"
                                },
                                "order": "asc"
                            }
                        },
                        "_source": {
                            "includes": ["published", "actor", "object", "target", "extra"]
                        },
                        "size": 100
                    }
                }
            }
        }
    },
    "sort": [
        {
            "published": {
                "order": "desc"
            }
        }
    ]
}

For clarity, the Painless function is as follows:

def weights = [{'id': 'mark', 'weight': 1}, {'id': 'jane', 'weight': 3}];
def weight = 0;
for (int i = 0; i < weights.length; ++i) {
    if (weights[i].id == doc.actor.id)
        return weights[i].weight;
}
return weight;

Elasticsearch gives me a compile error near the definition of the array. My guess is that I cannot define a list/array of JSON objects:

compile error","script_stack":["def weights = [{'id': 'mark', 'weight ...","               ^---- HERE"]....

Is there any way to accomplish this with or without a sorting script?


Solution

  • Painless is not a JavaScript-like language. You can't define an array with JSON-like syntax.

    The full documentation for arrays is here. You also have to create a Map to represent each of your JSON objects: Painless has its own initializer syntax, where a list is written [1, 2, 3] and a map ['id': 'mark', 'weight': 1].

    But in your case you should definitely use script params.

    Could you try something like:

    "sort": {
        "_script": {
            "type": "number",
            "script": {
                "lang": "painless",
                "source": "def weight = 0; for (int i = 0; i < params.weights.size(); ++i) { if (params.weights[i].id == doc['actor.id'].value) return params.weights[i].weight; } return weight;",
                "params": {
                    "weights": [{"id": "mark", "weight": 1}, {"id": "jane", "weight": 3}]
                }
            },
            "order": "asc"
        }
    }
    

    By using params, you can define your input data with JSON syntax, and furthermore you allow Elasticsearch to cache the compiled version of your script, since the source remains the same even if the weights array changes.
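
To illustrate the caching point, here is a minimal client-side sketch in Python that builds the sort clause for any weights list while keeping the script source fixed. The helper name `build_weight_sort` and the constant `SCRIPT_SOURCE` are hypothetical, not part of any Elasticsearch client. It assumes `actor.id` is a keyword field with doc values and that `params.weights` reaches Painless as a List of Maps, which is why the script calls `size()`.

```python
import json

# Painless source kept constant so the compiled script can be reused.
# Assumption: params.weights is a List of Maps inside Painless, so size() is used.
SCRIPT_SOURCE = (
    "for (int i = 0; i < params.weights.size(); ++i) {"
    " if (params.weights[i].id == doc['actor.id'].value)"
    " return params.weights[i].weight; }"
    " return 0;"
)


def build_weight_sort(weights, order="asc"):
    """Return a `_script` sort clause; only `params` varies between calls."""
    return {
        "_script": {
            "type": "number",
            "script": {
                "lang": "painless",
                "source": SCRIPT_SOURCE,
                "params": {"weights": list(weights)},
            },
            "order": order,
        }
    }


# Two different weight sets produce the same script source.
sort_a = build_weight_sort([{"id": "mark", "weight": 1}, {"id": "jane", "weight": 3}])
sort_b = build_weight_sort([{"id": "mark", "weight": 5}])

# The resulting dict is plain JSON and can be placed under "top_hits" -> "sort".
print(json.dumps(sort_a, indent=2))
```

The returned dict would replace the "sort" object inside the "top_hits" aggregation of the original request. Only the "params" section differs from one request to the next.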