azure-data-explorer

Separation of clusters for ingestion and export


Given that a cluster's memory and CPU usage is dominated by ingestion, is it better to have a separate follower cluster dedicated only to export? The use case is exporting a huge amount of data out of an ADX cluster, with all nodes participating in the export. In other words, is there any disadvantage to using a follower cluster for export rather than the leader cluster itself? Or is it a better strategy to simply scale the main (leader) cluster up or out to handle the heavy export, without going through a follower cluster? What is the best way to optimize export in this case? The export is to an external table that points to storage in the same region as the cluster.


Solution

  • I suggest scaling the existing cluster up or out instead of creating a follower cluster. A single cluster is easier to manage, and you'll pay less.

    For efficient export, the recommendation is to export to Parquet format and set the useNativeParquetWriter flag; see the export documentation for more details.
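
    As a rough sketch of what that could look like, the command below exports a query result to an external table. MyExternalTable, MyTable and Timestamp are hypothetical names, and the external table is assumed to have been created beforehand with Parquet as its data format. It also assumes that useNativeParquetWriter is accepted in the with (...) clause of your service version, so check the documentation before relying on it.

    ```kusto
    .export async to table MyExternalTable   // hypothetical Parquet external table
        with (useNativeParquetWriter = true)  // flag from the answer; assumed valid in this clause
        <| MyTable                            // hypothetical source table
        | where Timestamp > ago(1d)           // hypothetical filter; export only what you need
    ```

    Because the command is async, it returns an operation ID, which you can pass to .show operations to follow progress.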