
How to migrate Apache Druid data between 2 instances?


We have two Druid instances: one for staging and data validation, and another for production. Once data is loaded and validated on the staging instance, we need to migrate it to production. Is there a way to migrate the data directly to the other instance instead of reloading it?


Solution

  • Well, in theory the only things you need are the segment metadata records and the segment data files themselves. If you store your metadata in (for example) MySQL, you can export the records from the druid_segments table.

    Each druid_segments record also shows where the segment file is stored (see the payload column).

    Next, copy the data files to the location used in production. Make sure the payload column of each record points to this new location.

    Then import the records into your production metadata store and you should be all set.

    Before applying this in production, please try it out in a test environment first.

    Maybe this page will help you along. It contains useful information for your situation: https://support.imply.io/hc/en-us/articles/115004960053-Migrate-existing-Druid-Cluster-to-a-new-Imply-cluster
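
As a rough illustration of the "make the payload point to the new location" step, here is a minimal Python sketch. It assumes the payload column holds the segment descriptor as JSON with a `loadSpec` object containing a `path` field, which is what you would typically see with local or HDFS deep storage; other deep storage types (S3, for example) use different `loadSpec` fields and are not handled here. The function name `rewrite_payload`, the sample record, and both directory prefixes are made up for the example, so check them against what your own druid_segments rows actually contain.

```python
import json


def rewrite_payload(payload, old_prefix, new_prefix):
    """Return the payload JSON with loadSpec.path moved from old_prefix to new_prefix.

    Hypothetical helper: assumes a loadSpec that carries a 'path' field.
    """
    # The payload column is often stored as a BLOB, so accept bytes as well as str.
    if isinstance(payload, (bytes, bytearray)):
        payload = payload.decode("utf-8")

    segment = json.loads(payload)
    load_spec = segment.get("loadSpec", {})
    path = load_spec.get("path")

    if path is None:
        raise ValueError("loadSpec has no 'path' field; this sketch does not cover that storage type")
    if not path.startswith(old_prefix):
        raise ValueError("path %r does not start with %r" % (path, old_prefix))

    # Swap only the leading prefix and keep the rest of the segment path intact.
    load_spec["path"] = new_prefix + path[len(old_prefix):]
    return json.dumps(segment)


# Made-up record for illustration only; real payloads contain more fields.
sample = json.dumps({
    "dataSource": "events",
    "loadSpec": {
        "type": "local",
        "path": "/data/stage/segments/events/2024-01-01/0/index.zip",
    },
})

updated = rewrite_payload(sample, "/data/stage", "/data/prod")
print(json.loads(updated)["loadSpec"]["path"])
```

You would run something like this over each exported row before importing it into the production metadata store. Failing loudly on an unexpected path is deliberate: a silently unchanged record would leave production pointing at the staging location.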