Say I have:
PUT /test/_doc/1
{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch",
  "data": {
    "modified_date": "2018-11-15T14:12:12",
    "password": "abcpassword"
  }
}
Then I get the following mapping:
GET /test/_mapping/_doc
{
  "test": {
    "mappings": {
      "_doc": {
        "properties": {
          "data": {
            "properties": {
              "modified_date": {
                "type": "date"
              },
              "password": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          },
          "message": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "post_date": {
            "type": "date"
          },
          "user": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}
How can I reindex the mapping to bring modified_date to the same level as user, without losing any data? The documents should end up looking like this:
{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch",
  "modified_date": "2018-11-15T14:12:12",
  "data": {
    "password": "abcpassword"
  }
}
I'd suggest using an Ingest Node with a pipeline; you can read more about both in the Elasticsearch documentation. The idea is to construct a pipeline and reference it during the indexing or reindexing process, so that each document goes through the pre-processing defined in the pipeline before it is actually stored in the destination index.
I've created the pipeline below for your use case. It adds a new field modified_date with the value copied from data.modified_date and then removes data.modified_date. Any fields not mentioned in the pipeline are left untouched and ingested into the destination index as-is.
PUT _ingest/pipeline/mydatepipeline
{
  "description" : "modified date pipeline",
  "processors" : [
    {
      "set" : {
        "field": "modified_date",
        "value": "{{data.modified_date}}"
      }
    },
    {
      "remove": {
        "field": "data.modified_date"
      }
    }
  ]
}
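Before reindexing, you can verify the transformation by running your sample document through the pipeline with the _simulate API (a quick sketch using the example document from your question):
POST _ingest/pipeline/mydatepipeline/_simulate
{
  "docs": [
    {
      "_source": {
        "user" : "kimchy",
        "post_date" : "2009-11-15T14:12:12",
        "message" : "trying out Elasticsearch",
        "data": {
          "modified_date": "2018-11-15T14:12:12",
          "password": "abcpassword"
        }
      }
    }
  ]
}
The response shows each document as it would look after the pipeline runs, so you can confirm modified_date is moved to the top level before touching any index.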
Once the pipeline is in place, use it to perform the reindexing:
POST _reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test_dest",
    "pipeline": "mydatepipeline"
  }
}
The documents will be transformed as you expect and indexed into the test_dest index. Note that you need to explicitly create test_dest beforehand with the mapping details you require.
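As a rough sketch (assuming you keep the same field types as the original index, with modified_date promoted to the top level; adjust the types to your needs), creating the destination index could look like this:
PUT test_dest
{
  "mappings": {
    "_doc": {
      "properties": {
        "user": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword", "ignore_above": 256 }
          }
        },
        "post_date": { "type": "date" },
        "modified_date": { "type": "date" },
        "data": {
          "properties": {
            "password": {
              "type": "text",
              "fields": {
                "keyword": { "type": "keyword", "ignore_above": 256 }
              }
            }
          }
        }
      }
    }
  }
}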
You can also apply the pipeline when indexing documents directly. For a bulk operation, pass it as a query parameter (POST _bulk?pipeline=mydatepipeline; a sketch of a full request follows the next example), or reference it when indexing a single document:
PUT test/_doc/1?pipeline=mydatepipeline
{
  "user" : "kimchy",
  "post_date" : "2009-11-15T14:12:12",
  "message" : "trying out Elasticsearch",
  "data": {
    "modified_date": "2018-11-15T14:12:12",
    "password": "abcpassword"
  }
}
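For the bulk case, the pipeline is applied to every document in the request. A minimal sketch (the document ID here is just an example) would be:
POST _bulk?pipeline=mydatepipeline
{ "index": { "_index": "test", "_type": "_doc", "_id": "2" } }
{ "user": "kimchy", "post_date": "2009-11-15T14:12:12", "message": "trying out Elasticsearch", "data": { "modified_date": "2018-11-15T14:12:12", "password": "abcpassword" } }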
For both the bulk and single-document indexing cases, you need to ensure the target index's mapping is created accordingly.
Hope this helps!