xquery, marklogic

MarkLogic: copying data from one forest to multiple forests


I need to copy the contents of a MarkLogic database (50 million XML documents) from one host to another. A backup/restore would do that, but I also need to take the data currently held in two forests (25 million documents each) and distribute it evenly across 20 forests (2.5 million each). Can this be done with xqsync or another utility?


Solution

  • As wst indicates, MarkLogic 7 does this automatically: rebalancing is enabled by default for new databases. For databases upgraded from earlier versions, you need to enable rebalancing manually in the Admin interface. You can find the setting on the database Configure tab, near the bottom.
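
    If you would rather script it than click through the Admin interface, the same setting can probably be flipped with the Admin API. This is only a sketch: "Documents" is a placeholder database name, and it assumes MarkLogic 7 or later, where the rebalancer functions exist.

    ```xquery
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
        at "/MarkLogic/admin.xqy";

    (: "Documents" is a placeholder; substitute your own database name. :)
    let $config := admin:get-configuration()
    let $db     := xdmp:database("Documents")
    (: Turn the rebalancer on for this database. :)
    let $config := admin:database-set-rebalancer-enable($config, $db, fn:true())
    return admin:save-configuration($config)
    ```

    Run it as a user with the admin role, for example from Query Console.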

    After that, you just add new forests to your database as needed, and redistribution is triggered automatically after a slight delay, across a cluster as well. Like the reindexer, the rebalancer is governed by a throttle level. You can follow its progress on the Database Status page in the Admin interface. It may take a while, though, since it is designed to run in the background with little interference.
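
    For the 2-to-20 case in the question, adding 18 forests by hand is tedious, so here is a rough sketch of doing it through the Admin API. The database name "Documents" and the forest names Documents-3 to Documents-20 are made up for illustration, and every forest is created on the local host in the default data directory. On a cluster you would pass the id of each target host instead of xdmp:host(). The forests are created and saved in a first statement and attached in a second one, so that xdmp:forest can resolve the new names.

    ```xquery
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
        at "/MarkLogic/admin.xqy";

    (: Hypothetical forest names: Documents-3 .. Documents-20. :)
    declare variable $NAMES as xs:string* :=
      for $i in 3 to 20 return fn:concat("Documents-", $i);

    (: Thread the configuration through one forest-create call per name. :)
    declare function local:create($config, $names as xs:string*)
    {
      if (fn:empty($names)) then $config
      else local:create(
        admin:forest-create($config, $names[1], xdmp:host(), ()),
        fn:subsequence($names, 2))
    };

    admin:save-configuration(local:create(admin:get-configuration(), $NAMES))
    ;

    (: Second statement: attach the forests that were just created. :)
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
        at "/MarkLogic/admin.xqy";

    declare variable $NAMES as xs:string* :=
      for $i in 3 to 20 return fn:concat("Documents-", $i);

    (: Attach each forest in turn, passing the updated configuration along. :)
    declare function local:attach($config, $db as xs:unsignedLong, $names as xs:string*)
    {
      if (fn:empty($names)) then $config
      else local:attach(
        admin:database-attach-forest($config, $db, xdmp:forest($names[1])),
        $db,
        fn:subsequence($names, 2))
    };

    admin:save-configuration(
      local:attach(admin:get-configuration(), xdmp:database("Documents"), $NAMES))
    ```

    With rebalancing enabled on the database, attaching the forests should be enough to start the redistribution; nothing else needs to be called.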

    The other way around is almost as easy. Go to the Forests page under the database and select 'retired' next to the forest you want to remove. This automatically triggers rebalancing of documents away from that forest. Once that is done, you just detach it from the database.
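
    The retire-then-detach sequence can also be sketched with the Admin API. Again the names are placeholders ("Documents" for the database, "Documents-20" for the forest to remove). The second statement is meant to be run later, once rebalancing has finished, and with "Documents" as the content database, because xdmp:estimate counts against the database the query runs in. The emptiness check is my own precaution, not something MarkLogic requires.

    ```xquery
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
        at "/MarkLogic/admin.xqy";

    (: Step 1: mark the forest as retired so the rebalancer drains it. :)
    let $config := admin:get-configuration()
    let $config := admin:database-retire-forest(
                     $config,
                     xdmp:database("Documents"),
                     xdmp:forest("Documents-20"))
    return admin:save-configuration($config)
    ;

    (: Step 2 (run later): detach the forest only if it no longer holds documents. :)
    xquery version "1.0-ml";
    import module namespace admin = "http://marklogic.com/xdmp/admin"
        at "/MarkLogic/admin.xqy";

    let $forest := xdmp:forest("Documents-20")
    (: Unfiltered count of the documents still stored in this one forest. :)
    let $remaining :=
      xdmp:estimate(cts:search(fn:doc(), cts:and-query(()), (), 1.0, $forest))
    return
      if ($remaining eq 0)
      then admin:save-configuration(
             admin:database-detach-forest(
               admin:get-configuration(),
               xdmp:database("Documents"),
               $forest))
      else fn:concat($remaining, " documents still in forest; not detaching yet")
    ```

    The same xdmp:estimate expression, run once per forest id, is a quick way to check how evenly the documents ended up distributed.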

    All data remains fully searchable and accessible throughout, though response times can be relatively slow, as caches need to be refreshed as well.

    HTH