I have a nested field in my index that contains several objects.
"customFields" : [
{
"objectTypeId" : 17,
"Value" : "",
"description" : "The original author of the document",
"Name" : "Document Author"
},
{
"objectTypeId" : 17,
"Value" : "",
"description" : "Source document number",
"Name" : "Legacy document number"
},
.
.
.
]
I want to create a script that can move the fields out from the customFields object into separate objects like this:
"Document_Author": {
"Description": "The original author of the document",
"Value": "Some value"
"ObjectTypeId": 17
},
"Legacy document number": {
"Description": "Source document number",
"Value": "Some value"
"ObjectTypeId": 17
},
.
.
.
I tried a script like this; I am very new to Elasticsearch and scripting, so it does not work.
POST /new_document-20/_update_by_query
{
"script" : { "inline": "for (int i = 0; i < ctx._source.customFields.length; ++i) { ctx._source.add(\"customFields[i].Name\" : { \"Value\" : \"customFields[i].Value\", \"Description\" : \"customFields[i].description\", \"objectTypeId\" : \"customFields[i].objectTypeId\"}) }",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "customFields.Name"
}
}
]
}
}
}
}
I get compilation errors from this, pointing to customFields[i].Name, like this:
"error": {
"root_cause": [
{
"type": "script_exception",
"reason": "compile error",
"script_stack": [
"... d(\"customFields[i].Name\" : { \"Value\" : \"customFiel ...",
" ^---- HERE"
How can I create a script that helps me move the fields out from the nested object?
You can perform only one ctx._source write operation per loop to prevent the "The maximum number of statements that can be executed in a loop has been reached." error.
With that being said, I'd suggest to: copy the _source, remove the customFields list from the copy, iterate over that list while writing each entry into the copied source under its Name, and finally replace the _source fully.
In practical terms:
POST /new_document-20/_update_by_query
{
  "script": {
    "inline": """
      def source_copy = ctx._source;
      def customFields = source_copy.remove('customFields');
      for (int i = 0; i < customFields.size(); i++) {
        // store the current iteratee
        def current = customFields[i];
        // remove AND return the name
        def name = current.remove('Name');
        // set in the _source
        source_copy[name] = current;
      }
      // replace the original source completely
      ctx._source = source_copy;
    """
  },
  "query": {
    "bool": {
      "must": [
        {
          "exists": {
            "field": "customFields.Name"
          }
        }
      ]
    }
  }
}
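To make the mechanics of the script easier to follow, here is a plain-Python sketch of what the Painless loop does to a single document's _source. The function name is mine, and the sample data is taken from the question; Painless' Map.remove behaves like Python's dict.pop in that it removes the key and returns its value.

```python
def promote_custom_fields(source):
    """Move each entry of 'customFields' to a top-level key named
    after its 'Name' field, removing the original list."""
    custom_fields = source.pop("customFields")
    for current in custom_fields:
        # remove AND return the name (the Painless current.remove('Name'))
        name = current.pop("Name")
        # set the remaining fields under that name in the source
        source[name] = current
    return source

doc = {
    "customFields": [
        {
            "objectTypeId": 17,
            "Value": "",
            "description": "The original author of the document",
            "Name": "Document Author",
        }
    ]
}

result = promote_custom_fields(doc)
```

After the call, result has no customFields list anymore; instead it has a top-level "Document Author" object holding the remaining three fields, which is exactly the shape the question asks for.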
And as an inline script string:
"\n def source_copy = ctx._source;\n def customFields = source_copy.remove('customFields');\n \n for (int i = 0; i < customFields.length; i++) {\n // store the current iteratee\n def current = customFields[i];\n \n // remove AND return the name\n def name = current.remove('Name');\n \n // set in the _source\n source_copy[name] = current;\n }\n \n // replace the original source completely\n ctx._source = source_copy;\n "
By the way, hash maps in Painless are instantiated either through a new HashMap() call or through the (slightly confusing) [:] operator; a non-empty map uses the same bracket syntax with key-value pairs, i.e.:
def entries_map_without_name = [
"Value" : current.Value,
"Description" : current.description,
"objectTypeId" : current.objectTypeId
];
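If you prefer that non-mutating style, the loop body can build a fresh map per entry instead of reusing the mutated current. A plain-Python sketch of this variant (build_entry is a hypothetical helper of mine, not part of the answer's script):

```python
def build_entry(current):
    """Build a new map with only the wanted keys, leaving 'current'
    untouched -- the Python analogue of the Painless map literal."""
    return {
        "Value": current["Value"],
        "Description": current["description"],
        "objectTypeId": current["objectTypeId"],
    }

entry = build_entry({
    "objectTypeId": 17,
    "Value": "",
    "description": "Source document number",
    "Name": "Legacy document number",
})
```

The trade-off is a little more typing in exchange for an explicit list of the keys that end up in the new object, so stray fields in customFields entries never leak into the top level.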
P.S. The conversion from a nested list of objects to a bunch of hash maps that you were trying to perform has its advantages and disadvantages, especially when it comes to mapping size bloat and the quite limited aggregation possibilities.
Shameless plug -- I discuss just that in my Elasticsearch Handbook, specifically in this sub-chapter.