I have two indexes: twitter and reitwitter.
twitter has multiple documents across different types, like:
"hits": [
{
"_index": "twitter",
"_type": "tweet",
"_id": "1",
"_score": 1,
"_source": {
"message": "trying out Elasticsearch"
}
},
{
"_index": "twitter",
"_type": "tweet2",
"_id": "1",
"_score": 1,
"_source": {
"message": "trying out Elasticsearch2"
}
},
{
"_index": "twitter",
"_type": "tweet1",
"_id": "1",
"_score": 1,
"_source": {
"message": "trying out Elasticsearch1"
}
}
]
Now, when I reindex, I want to get rid of all the different types and use just one, because they all have essentially the same field mappings.
I tried several different combinations, but I always get only one document instead of those three.
Approach 1:
POST _reindex/
{
"source": {
"index": "twitter"
},
"dest": {
"index": "reitwitter",
"type": "reitweet"
}
}
Response:
{
"took": 12,
"timed_out": false,
"total": 3,
"updated": 3,
"created": 0,
"deleted": 0,
"batches": 1,
"version_conflicts": 0,
"noops": 0,
"retries": {
"bulk": 0,
"search": 0
},
"throttled_millis": 0,
"requests_per_second": -1,
"throttled_until_millis": 0,
"failures": []
}
Note: it says "updated": 3, I guess because this was the second time I made the same call?
Approach 2:
POST _reindex/
{
"source": {
"index": "twitter",
"query": {
"match_all": {
}
}
},
"dest": {
"index": "reitwitter",
"type": "reitweet"
}
}
Same response as the first one.
In both cases when I make this GET call:
GET reitwitter/_search
{
"query": {
"match_all": {
}
}
}
I only get one document:
{
"_index": "reitwitter",
"_type": "reitweet",
"_id": "1",
"_score": 1,
"_source": {
"message": "trying out Elasticsearch1"
}
}
Is this use case even supported by reindex? If not, do I have to write a script using scan and scroll to get all the documents from the source index and reindex them with the same doc type in the destination?
PS: I don't want to use "_source": ["tweet1", "tweet"] because I have around a million doc types, each with a single document, that I want to map to the same doc type in the destination.
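The problem is that all the documents have the same id (1). In the source index they are kept apart by their different types, but once they are all mapped to a single type in the destination they share the same type and id, so each one overwrites the previous one during the reindex.
Try indexing your documents with different ids and you will see that it works.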
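To illustrate the collision, here is a minimal Python sketch that models the destination index as a dictionary keyed by (type, id). It only simulates the overwrite; it does not talk to Elasticsearch. The helper names collapse_types and collapse_types_unique_ids are made up for this example. The last part prints a possible _reindex body that prefixes each id with its source type. That body rests on two assumptions you should check against your Elasticsearch version: that a reindex script sees the source document's type in ctx._type and may reassign ctx._id and ctx._type, and that the script text goes under the "inline" key (newer versions use "source" instead).

```python
import json


def collapse_types(hits, dest_type):
    """Model the destination index as a dict keyed by (type, id).

    Every hit is written under the same destination type with its original
    id, so hits that share an id land on the same key and overwrite each other.
    """
    dest = {}
    for hit in hits:
        dest[(dest_type, hit["_id"])] = hit["_source"]
    return dest


def collapse_types_unique_ids(hits, dest_type):
    """Same as above, but prefix each id with the source type to keep ids distinct."""
    dest = {}
    for hit in hits:
        new_id = hit["_type"] + "-" + hit["_id"]
        dest[(dest_type, new_id)] = hit["_source"]
    return dest


# The three documents from the question, in the order they were listed.
hits = [
    {"_type": "tweet", "_id": "1", "_source": {"message": "trying out Elasticsearch"}},
    {"_type": "tweet2", "_id": "1", "_source": {"message": "trying out Elasticsearch2"}},
    {"_type": "tweet1", "_id": "1", "_source": {"message": "trying out Elasticsearch1"}},
]

collided = collapse_types(hits, "reitweet")
distinct = collapse_types_unique_ids(hits, "reitweet")
print(len(collided), "document(s) when ids are kept as-is")
print(len(distinct), "document(s) when ids are prefixed with the source type")

# Assumed request body for POST _reindex (see the caveats above): the script
# builds a new id from the source type and the old id, then sets a single type.
reindex_body = {
    "source": {"index": "twitter"},
    "dest": {"index": "reitwitter"},
    "script": {
        "inline": "ctx._id = ctx._type + '-' + ctx._id; ctx._type = 'reitweet'"
    },
}
print(json.dumps(reindex_body, indent=2))
```

If the assumptions hold for your version, the printed JSON can be sent as the body of POST _reindex, and the three documents should end up in reitwitter with ids such as tweet-1, tweet1-1 and tweet2-1 instead of collapsing into one.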