Search code examples
azurecloudazure-data-explorer

Azure Data Explorer cluster Migration


I accidentally created ADEx cluster without zonal redundancy is it possible to change it now if not can someone guide me on moving the data from existing cluster to another cluster that is ZR.

it will be better if someone gave me steps on achieving the goal thanks in advance


Solution

  • As per the Documenation,

    We cannot modify the Zone Redundancy property once the ADX cluster was created.

    To move data from one cluster to another, Azure Data Factory Pipeline can be one option. But for this method, the required databases in the target cluster should be created prior to the pipeline run.

    First create two Azure Data Explorer linked services in ADF. One for the source cluster and another for the target cluster and make sure you have required permission as mentioned in the above documentation.

    Here my source cluster name is rakeshadx1 which contains the below databases and tables.

    rakeshadx1 -> table1
               -> table2
               -> table3
               -> table4
    rakeshadx2 -> table1
               -> table2
               -> table3
               -> table4
    

    My target cluster name is rakeshcluster2 and I have created 2 databases in it with same name as above.

    While creating the two ADX linked services, use linked service parameters for the database name. We are using this to change the database name dynamically inside loop.

    enter image description here

    Here, we are using two level pipelines (ADF doesn't support nested loops). In parent pipeline, we are getting the database list from source cluster using .show databases script in Azure Data Explorer Command activity(give any source cluster database name for the linked service parameter).

    enter image description here

    Pass this list to ForEach activity and inside ForEach, we are getting the list of tables in current database using .show tables script in another ADX command activity(Here, pass the database name @item().DatabaseName to the linked service parameter). This will give the list of tables and along with the current database name as array of JSON.

    enter image description here

    In child pipeline, create an array parameter and pass the above generated array from ADX activity @activity('ADX get tables from db').output.value to that using Execute pipeline activity.

    enter image description here

    In the child pipeline, use the array parameter for the ForEach activity and inside foreach activity, use another ADX linked service but this is for the target cluster. Pass the current database name @item().DatabaseName to the linked service parameter and use the below dynamic content which creates new table in the current database of the target cluster.

    @concat('.set ',item().TableName,' <| cluster("rakeshadx1.eastus").database("',item().DatabaseName,'").',item().TableName)
    

    enter image description here

    Make sure the Sequential checkbox is checked in the Foreach activities.

    Execute the Parent pipeline by clicking on debug and your Source cluster tables will be copied to the target cluster.

    enter image description here

    Use the below pipeline JSONs to implement the above method.

    My Parent pipeline JSON:

    {
        "name": "parent",
        "properties": {
            "activities": [
                {
                    "name": "Azure Data Explorer Command1",
                    "type": "AzureDataExplorerCommand",
                    "dependsOn": [],
                    "policy": {
                        "timeout": "0.12:00:00",
                        "retry": 0,
                        "retryIntervalInSeconds": 30,
                        "secureOutput": false,
                        "secureInput": false
                    },
                    "userProperties": [],
                    "typeProperties": {
                        "command": ".show databases",
                        "commandTimeout": "00:20:00"
                    },
                    "linkedServiceName": {
                        "referenceName": "AzureDataExplorer1",
                        "type": "LinkedServiceReference",
                        "parameters": {
                            "database_name": "rakeshadx1"
                        }
                    }
                },
                {
                    "name": "ForEach1",
                    "type": "ForEach",
                    "dependsOn": [
                        {
                            "activity": "Azure Data Explorer Command1",
                            "dependencyConditions": [
                                "Succeeded"
                            ]
                        }
                    ],
                    "userProperties": [],
                    "typeProperties": {
                        "items": {
                            "value": "@activity('Azure Data Explorer Command1').output.value",
                            "type": "Expression"
                        },
                        "isSequential": true,
                        "activities": [
                            {
                                "name": "Execute Pipeline1",
                                "type": "ExecutePipeline",
                                "dependsOn": [
                                    {
                                        "activity": "ADX get tables from db",
                                        "dependencyConditions": [
                                            "Succeeded"
                                        ]
                                    }
                                ],
                                "userProperties": [],
                                "typeProperties": {
                                    "pipeline": {
                                        "referenceName": "Child",
                                        "type": "PipelineReference"
                                    },
                                    "waitOnCompletion": true,
                                    "parameters": {
                                        "tables_arr": {
                                            "value": "@activity('ADX get tables from db').output.value",
                                            "type": "Expression"
                                        }
                                    }
                                }
                            },
                            {
                                "name": "ADX get tables from db",
                                "type": "AzureDataExplorerCommand",
                                "dependsOn": [],
                                "policy": {
                                    "timeout": "0.12:00:00",
                                    "retry": 0,
                                    "retryIntervalInSeconds": 30,
                                    "secureOutput": false,
                                    "secureInput": false
                                },
                                "userProperties": [],
                                "typeProperties": {
                                    "command": ".show tables",
                                    "commandTimeout": "00:20:00"
                                },
                                "linkedServiceName": {
                                    "referenceName": "AzureDataExplorer1",
                                    "type": "LinkedServiceReference",
                                    "parameters": {
                                        "database_name": {
                                            "value": "@item().DatabaseName",
                                            "type": "Expression"
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            ],
            "annotations": []
        }
    }
    

    Child pipeline JSON:

    {
        "name": "Child",
        "properties": {
            "activities": [
                {
                    "name": "ForEach1",
                    "type": "ForEach",
                    "dependsOn": [],
                    "userProperties": [],
                    "typeProperties": {
                        "items": {
                            "value": "@pipeline().parameters.tables_arr",
                            "type": "Expression"
                        },
                        "isSequential": true,
                        "activities": [
                            {
                                "name": "Azure Data Explorer Command2",
                                "type": "AzureDataExplorerCommand",
                                "dependsOn": [],
                                "policy": {
                                    "timeout": "0.12:00:00",
                                    "retry": 0,
                                    "retryIntervalInSeconds": 30,
                                    "secureOutput": false,
                                    "secureInput": false
                                },
                                "userProperties": [],
                                "typeProperties": {
                                    "command": {
                                        "value": "@concat('.set ',item().TableName,' <| cluster(\"rakeshadx1.eastus\").database(\"',item().DatabaseName,'\").',item().TableName)",
                                        "type": "Expression"
                                    },
                                    "commandTimeout": "00:20:00"
                                },
                                "linkedServiceName": {
                                    "referenceName": "AzureDataExplorer2",
                                    "type": "LinkedServiceReference",
                                    "parameters": {
                                        "database_name": {
                                            "value": "@item().DatabaseName",
                                            "type": "Expression"
                                        }
                                    }
                                }
                            }
                        ]
                    }
                }
            ],
            "parameters": {
                "tables_arr": {
                    "type": "array"
                }
            },
            "variables": {
                "query": {
                    "type": "String"
                }
            },
            "annotations": []
        }
    }