Tags: mongodb, aws-documentdb, mongodump, mongorestore, mongodb4.0

Rename database and indexes during mongorestore


I'm migrating data from a MongoDB 4.0 server to Amazon DocumentDB (docdb4.0 family), and I've run into the error "createIndex error: namespace name generated from index name is too long", which has been reported fairly often. However, I have not seen this specific question answered, nor have I found any examples of what I'm trying to accomplish.

I need to rename both the database and the too-long indexes during the mongorestore. The old database name had a long suffix that needs to be removed in the DocumentDB instance, and the index names are longer than the DocumentDB limit.

Currently, my mongorestore command looks like this:

mongorestore \
--stopOnError \
-v \
--nsFrom="myapp_long_suffix_here.*" --nsTo="myapp.*" \
--nsFrom="myapp.datadocs.1:metadata.17:duplicate_call.1:duplicate_of_datadoc_id_1" --nsTo="myapp.datadocs.duplicate_of_datadoc_id" \
--nsFrom="myapp.datadocs.1:metadata.3:publisher_id_1_1:metadata.1:created_at_1" --nsTo="myapp.datadocs.publisher_id_created_at" \
--nsFrom="myapp.datadocs.1:metadata.17:duplicate_call.3:duplicate_reason_1" --nsTo="myapp.datadocs.duplicate_reason" \
--nsFrom="myapp.datadocs.2:parsed_message.11:from_addr_sha256_hash_1" --nsTo="myapp.datadocs.from_addr_hash" \
--gzip \
--archive=mongodb-0_myapp_datadocs_2014.archive.gz

The above command renames the database as expected and restores all of the documents, but it does not create any indexes; it fails with the error "createIndex error: namespace name generated from index name is too long", as noted above.

I have a few questions about this:

  1. Is there a way to know or find out which specific indexes it's referring to?
  2. Is there a way to find out how the namespace name is being generated (perhaps I'm specifying my index names improperly in the nsFrom/nsTo flags)?
  3. Is it possible that the nsFrom/nsTo flags that rename the database are impacting the ability to use those flags for indices because they use wildcards?

Any other advice, or examples from anyone who has done this successfully, is welcome. Also, if this approach won't work, how would you recommend restoring all documents and indexes for a 1TB+ collection in DocumentDB 4.0 set up as a replica set (so, not standalone, which means db.reIndex() won't work, as I understand it)?

Thanks for your help & advice.


Solution

  • The maximum index name length in Amazon DocumentDB is 63 characters; see the quotas and limits page in the documentation. If no name is specified for an index, the name is generated by concatenating the indexed field names and their sort directions, which is how compound indexes on long field names end up over the limit.

    What I suggest for your case is the following:

    • Get the index tool from Amazon DocumentDB tools repo - https://github.com/awslabs/amazon-documentdb-tools/tree/master/index-tool
    • Export the indexes from MongoDB using the tool
    • Check for issues with: documentdb_index_tool.py --show-issues --dir <directory that contains index metadata dump>
    • If any index name is longer than 63 characters, edit the .json files containing the index definitions and shorten the name field accordingly
    • Once the issues are fixed, restore the indexes to Amazon DocumentDB using the same tool
    • Finally, mongorestore with the --noIndexRestore option
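The "modify the .json files" step above can be scripted when there are many indexes. The following is a minimal sketch, not part of the index tool: it assumes the exported files are named *.metadata.json and hold an "indexes" array whose entries carry a "name" field (the layout mongodump writes; whether the tool's export matches it exactly should be checked against your dump). The function names and the renames mapping are hypothetical, and the sketch measures the index name alone, so consult the quotas page for exactly which string the 63-character limit applies to.

```python
import json
from pathlib import Path

NAME_LIMIT = 63  # index name limit quoted in the answer


def long_index_names(metadata, limit=NAME_LIMIT):
    """Return the index names in one metadata document that exceed `limit`."""
    return [
        index["name"]
        for index in metadata.get("indexes", [])
        if len(index.get("name", "")) > limit
    ]


def rename_indexes(metadata, renames):
    """Return a copy of `metadata` with index names replaced per `renames`."""
    updated = dict(metadata)
    updated["indexes"] = []
    for index in metadata.get("indexes", []):
        index = dict(index)  # copy so the caller's data is left untouched
        if index.get("name") in renames:
            index["name"] = renames[index["name"]]
        updated["indexes"].append(index)
    return updated


def fix_metadata_dir(directory, renames, limit=NAME_LIMIT):
    """Rewrite every *.metadata.json under `directory` (assumed layout).

    Returns the index names that are still longer than `limit` afterwards.
    """
    remaining = []
    for path in sorted(Path(directory).rglob("*.metadata.json")):
        metadata = rename_indexes(json.loads(path.read_text()), renames)
        path.write_text(json.dumps(metadata))
        remaining.extend(long_index_names(metadata, limit))
    return remaining
```

Calling fix_metadata_dir with an empty mapping first lists the offending names without changing any of them; a second call with a mapping from each old name to a shorter one should then return an empty list. Work on a copy of the export directory, since the files are rewritten in place.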

    Pre-creating the indexes also shortens the overall restore, since it is much faster than creating the indexes on existing data.
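As background for the generated names mentioned at the top of this answer, here is a small sketch of how MongoDB builds a default index name: each field is joined to its direction with an underscore, and the pairs are joined with underscores. The helper name and the example field names are hypothetical; they are only meant to show how quickly a compound index on nested fields approaches the limit.

```python
def default_index_name(key_spec):
    """Build the default name for an index from (field, direction) pairs."""
    return "_".join(f"{field}_{direction}" for field, direction in key_spec)


# Hypothetical compound index on two nested fields.
example = default_index_name([
    ("metadata.publisher_id", 1),
    ("metadata.created_at", 1),
])
print(example, len(example))
```

Passing an explicit, shorter name when creating the index avoids relying on the generated one.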