I am using the AWS Elasticsearch Service (version 6.3). I want to change the mapping while reindexing data from current_index to new_index. I am not trying to upgrade from an older Elasticsearch cluster to a new one; both current_index and new_index are on the same Elasticsearch 6.3 cluster.
I am trying to perform a Reindex in place operation by following the Elastic documentation.
My index contains about 250k searchable documents. When I POST a _reindex request using curl,
curl -X POST "aws_elasticsearch_endpoint/_reindex" -H 'Content-Type: application/json' -d'
{
"source": {
"index": "current_index"
},
"dest": {
"index": "new_index"
}
}
'
Elasticsearch starts the reindex process (I verified this with GET /_cat/indices?v), but curl ends with a curl: (56) Unexpected EOF error. The reindex operation itself works fine: after about 2 hours, the docs.count of new_index matches that of current_index and the health status turns green.
If I POST _reindex from Java, I get this error:
java.net.SocketException: Unexpected end of file from server
The Reindex API returns successfully, as specified here, only when the number of documents in my index is small (I tried with about 1k searchable documents).
This happens because the response takes a long time to return and the HTTP request times out before it arrives. On small data sets the response comes back before the timeout, which is why you get a response in that case.
The reindex keeps running after the request times out, though, and you can check how it is doing with this command:
GET _tasks?actions=*reindex&detailed=true
You can also add ...?wait_for_completion=false to the _reindex URL in your curl command. ES will then create a background task for the reindex operation, and the curl command will return early with a taskId that you can use to check the state of the reindex regularly using the Task API:
GET .tasks/task/<taskId>
Also note that in this case, once the task is done, you need to remove the task document from the .tasks index yourself; ES will not do it for you.
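Putting the background-task approach together, here is a minimal sketch rather than a definitive script. The function names and the ES_ENDPOINT variable are made up for illustration, and it assumes your domain accepts unsigned requests, as your original curl command does. The status check uses GET _tasks/<taskId>, and the task id is pulled out of the {"task":"..."} response with sed, which is fine for this one-field response but is not a general JSON parser.

```shell
# Placeholder endpoint (assumption): replace with your own domain endpoint.
ES_ENDPOINT="${ES_ENDPOINT:-aws_elasticsearch_endpoint}"

# Start the reindex as a background task. Prints the raw JSON response,
# which has the form {"task":"<nodeId>:<number>"}.
start_reindex() {
  curl -s -X POST "$ES_ENDPOINT/_reindex?wait_for_completion=false" \
    -H 'Content-Type: application/json' \
    -d '{"source":{"index":"current_index"},"dest":{"index":"new_index"}}'
}

# Read that JSON on stdin and print only the task id.
extract_task_id() {
  sed -n 's/.*"task"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Ask the Task API about one task; the response carries a "completed" flag.
get_task() {
  curl -s "$ES_ENDPOINT/_tasks/$1"
}

# Delete the stored task document from the .tasks index when you are done.
delete_task() {
  curl -s -X DELETE "$ES_ENDPOINT/.tasks/task/$1"
}
```

You would run task_id=$(start_reindex | extract_task_id) once, call get_task "$task_id" every so often until it reports "completed": true, and then call delete_task "$task_id" to clean up.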