I have created an index in Elasticsearch with multiple date fields and formatted them as yyyy-mm-dd HH:mm:ss. I later found that the dates are malformed and that wrong data was being populated into the fields. The index has more than 600,000 records and I don't want to lose any data. Now I need to create either new fields or a new index with the same date fields, formatted as YYYY-MM-ddTHH:mm:ss.Z, and populate all the records into the new index or the new fields.
I have used a date processor pipeline as below, but it fails. Please correct anything that is wrong here.
PUT _ingest/pipeline/date-malform
{
  "description": "convert malformed date to timestamp",
  "processors": [
    {
      "date": {
        "field": "event_tm",
        "target_field": "event_tm",
        "formats": ["YYYY-MM-ddThh:mm:ss.Z"],
        "timezone": "UTC"
      }
    },
    {
      "date": {
        "field": "vendor_start_dt",
        "target_field": "vendor_start_dt",
        "formats": ["YYYY-MM-ddThh:mm:ss.Z"],
        "timezone": "UTC"
      }
    },
    {
      "date": {
        "field": "vendor_end_dt",
        "target_field": "vendor_end_dt",
        "formats": ["YYYY-MM-ddThh:mm:ss.Z"],
        "timezone": "UTC"
      }
    }
  ]
}
I created the pipeline and ran a reindex as below:
POST _reindex
{
  "source": {
    "index": "tog_gen_test"
  },
  "dest": {
    "index": "data_mv",
    "pipeline": "some_ingest_pipeline",
    "version_type": "external"
  }
}
I am getting the below error while running the reindex:
"failures": [
  {
    "index": "data_mv",
    "type": "_doc",
    "id": "rwN64WgB936y_JOyjc57",
    "cause": {
      "type": "exception",
      "reason": "java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: unable to parse date [2019-02-12 10:29:35]",
      "caused_by": {
        "type": "illegal_argument_exception",
        "reason": "java.lang.IllegalArgumentException: unable to parse date [2019-02-12 10:29:35]",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "unable to parse date [2019-02-12 10:29:35]",
          "caused_by": {
            "type": "illegal_argument_exception",
            "reason": "Illegal pattern component: T"
          }
        }
      }
You can use Logstash, as Shailesh Pratapwar suggested, but you also have the option of using Elasticsearch reindex + ingest to do the same:
Create an ingest pipeline with the proper date processor to fix the date format: https://www.elastic.co/guide/en/elasticsearch/reference/master/date-processor.html
Reindex the data from the old index to a new index, applying the date manipulation through that pipeline. From https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html:
Reindex can also use the Ingest Node feature by specifying a pipeline
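As a rough sketch of those two steps: the "Illegal pattern component: T" in the error comes from the unquoted T in the pattern, and more importantly the date processor's formats setting describes how the incoming value looks, not how you want it stored. The sketch below assumes every stored value looks like the one in the error message (2019-02-12 10:29:35) and is in UTC; if your data differs, the pattern needs adjusting. The sample values for vendor_start_dt and vendor_end_dt in the simulate call are made up for illustration, and the reindex assumes data_mv maps these fields as date with the default format (or is left to dynamic mapping).

```
# 1. Pipeline: "formats" matches the INPUT strings (assumed "yyyy-MM-dd HH:mm:ss").
#    By default the processor writes the parsed value back in ISO 8601 form.
PUT _ingest/pipeline/date-malform
{
  "description": "convert malformed date to timestamp",
  "processors": [
    {
      "date": {
        "field": "event_tm",
        "target_field": "event_tm",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "timezone": "UTC"
      }
    },
    {
      "date": {
        "field": "vendor_start_dt",
        "target_field": "vendor_start_dt",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "timezone": "UTC"
      }
    },
    {
      "date": {
        "field": "vendor_end_dt",
        "target_field": "vendor_end_dt",
        "formats": ["yyyy-MM-dd HH:mm:ss"],
        "timezone": "UTC"
      }
    }
  ]
}

# 2. Dry run on one sample document before touching 600,000 records.
#    The two vendor_* values are hypothetical placeholders.
POST _ingest/pipeline/date-malform/_simulate
{
  "docs": [
    {
      "_source": {
        "event_tm": "2019-02-12 10:29:35",
        "vendor_start_dt": "2019-02-01 00:00:00",
        "vendor_end_dt": "2019-02-28 23:59:59"
      }
    }
  ]
}

# 3. Reindex through the pipeline created above (note the pipeline name).
POST _reindex
{
  "source": {
    "index": "tog_gen_test"
  },
  "dest": {
    "index": "data_mv",
    "pipeline": "date-malform"
  }
}
```

If the simulate call returns the three fields as ISO 8601 timestamps, the reindex should pass; if some documents carry a different layout, you can list additional patterns in formats, since it is an array.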