consistency azure-data-lake

Consistency of Azure Data Lake Store


What are the consistency guarantees of Azure Data Lake Store? Has anyone found technical documentation describing them?

I am particularly interested in whether directory moves are atomic, whether directory listings are consistent, and whether files are read-after-write consistent.


Solution

  • In Azure Data Lake Store, files are read-after-write consistent (sometimes also referred to as strong consistency). Directory listings are also strongly consistent.

    Directory and file rename operations are atomic, including renames that move a directory or file to a different parent. The one caveat is a rename whose destination already exists and that uses the OVERWRITE option: in that case the rename operation is not atomic. More information on rename with the OVERWRITE option is [located here](https://hadoop.apache.org/docs/r2.6.1/api/org/apache/hadoop/fs/FileSystem.html#rename(org.apache.hadoop.fs.Path, org.apache.hadoop.fs.Path, org.apache.hadoop.fs.Options.Rename...)).

    -Azure Data Lake Store Team
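
One practical consequence of the rename guarantee above is the common "write to a staging directory, then rename" pattern, kept on the atomic path by never renaming onto an existing destination. The following is only a minimal sketch of that pattern: it runs against the local filesystem as a stand-in and does not call any Azure Data Lake Store SDK. The names `publish_directory`, `demo`, `_staging`, and `output` are hypothetical, and the existence check here is a separate step from the rename, so it illustrates the intent rather than a race-free implementation.

```python
import os
import tempfile


def publish_directory(staging_dir: str, final_dir: str) -> None:
    """Move a fully written staging directory to its final location.

    Refuses to overwrite, so the rename stays in the case described as
    atomic above (the destination does not exist yet).
    """
    if os.path.exists(final_dir):
        # Local stand-in check (assumption): this is a separate step from
        # the rename below, so it is not race-free on its own.
        raise FileExistsError(final_dir)
    os.rename(staging_dir, final_dir)


def demo() -> list:
    # Hypothetical layout: write everything under "_staging", then publish.
    root = tempfile.mkdtemp()
    staging = os.path.join(root, "_staging")
    final = os.path.join(root, "output")
    os.mkdir(staging)
    for name in ("part-0", "part-1"):
        with open(os.path.join(staging, name), "w") as f:
            f.write("data\n")
    # A single rename makes all files visible under the final name.
    publish_directory(staging, final)
    return sorted(os.listdir(final))
```

Calling `demo()` writes two files under the staging directory and publishes them with one rename; calling `publish_directory` again with an existing destination raises `FileExistsError` instead of overwriting.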