Tags: azure, azure-blob-storage, azure-data-lake-gen2, azure-storage-account

60x storage consumption spike in Azure ADLS Gen2 within 10 days


The storage consumption on our ADLS Gen2 account rose from 5 TB to 314 TB within 10 days and has remained steady since then. The account has just 2 containers: the $logs container and a container holding all the directories for data storage. The $logs container looks empty. I have tried running Folder Statistics in Azure Storage Explorer on the other container, and none of its directories seems big enough to account for the spike.

Interestingly, Folder Statistics on one of the directories had been running for a few hours, so I cancelled it. On cancellation, the partial result showed 200+ TB and 88k+ blobs in it. A visual inspection of the directory showed just a handful of blobs that would barely sum up to 1 GB. This directory had been present for months without issue. Regardless, I deleted it and checked the storage consumption a few hours later, but could not see any change. (Screenshot: partial Folder Statistics result showing 200+ TB.)

This brings up the following questions:

  1. If I cancel an ongoing Folder Statistics run, could it show an incorrect partial result (in the above case it showed 200+ TB whereas the directory looked barely 1 GB in reality)? I have cancelled runs on previous occasions, and even the partial stats seemed plausible.
  2. Could there be hidden blobs in ADLS Gen2 that might not show up on visual inspection? (I have Read, Write and Delete access, if that matters.)
  3. I have run Folder Statistics in Azure Storage Explorer for all folders individually, but is there a better way to get the storage consumption in one go (at least broken down by directory and sub-directory; blob level would probably be overkill, but whatever works)? I have access to Databricks with a mount point to this container and can create a cluster with the required runtime if such code is specific to one.
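On question 3, one approach is a single recursive listing of the container aggregated into per-directory totals, instead of running Folder Statistics once per folder. The sketch below assumes the `azure-storage-file-datalake` SDK; the account, container, and credential names are placeholders, and the aggregation itself is plain Python:

```python
from collections import defaultdict

def directory_totals(paths_with_sizes, depth=1):
    """Aggregate (path, size) pairs into totals per directory prefix.

    depth=1 groups by top-level directory, depth=2 by sub-directory, etc.
    Files sitting above the chosen depth are grouped under "(root)".
    """
    totals = defaultdict(int)
    for path, size in paths_with_sizes:
        parts = path.strip("/").split("/")
        if len(parts) > depth:
            prefix = "/".join(parts[:depth])
        else:
            prefix = "/".join(parts[:-1]) or "(root)"
        totals[prefix] += size
    return dict(totals)

def adls_paths(account_name, container, credential):
    """Yield (path, size) for every file in the container.

    account_name/container/credential are placeholders; requires the
    azure-storage-file-datalake package.
    """
    from azure.storage.filedatalake import DataLakeServiceClient
    svc = DataLakeServiceClient(
        account_url=f"https://{account_name}.dfs.core.windows.net",
        credential=credential,
    )
    fs = svc.get_file_system_client(container)
    for p in fs.get_paths(recursive=True):
        if not p.is_directory:
            yield p.name, p.content_length

# Usage (placeholder names):
# print(directory_totals(adls_paths("mystorageacct", "data", my_credential)))
```

Because `get_paths` walks the whole hierarchy in one pass, this also gives a cross-check against whatever Folder Statistics reports.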

For reference: 60x_StorageSpike_Within10Days


Solution

  • Update: We found the cause of the increase. It was, in fact, a few copy activities created by our team. Interestingly, when we deleted the copied data, it took about 48 hours before the storage graph actually started going down, although the files disappeared immediately. This was not a delay in refreshing the consumption graph; it genuinely took that long before we saw the expected sharp dip in storage. We raised a Microsoft support case, and they confirmed that deleting such an amount of data can take time to complete in the background.
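On question 2 from the original post: blob snapshots, versions, and soft-deleted blobs do not appear in a normal listing but still count toward capacity, which can make the graph lag behind what visual inspection shows. A sketch, assuming the `azure-storage-blob` package and placeholder account/container names, that lists them explicitly and tallies bytes per category:

```python
from collections import Counter

def classify(blob):
    """Classify a listed blob record; `blob` is dict-like with optional
    'deleted' and 'snapshot' keys (mirroring BlobProperties fields)."""
    if blob.get("deleted"):
        return "soft-deleted"
    if blob.get("snapshot"):
        return "snapshot"
    return "active"

def tally_bytes(blobs):
    """Sum sizes per category for an iterable of blob records."""
    totals = Counter()
    for b in blobs:
        totals[classify(b)] += b.get("size", 0)
    return dict(totals)

def list_all_blobs(account_name, container, credential):
    """Yield every blob, including snapshots and soft-deleted ones.

    Placeholder names; requires the azure-storage-blob package.
    """
    from azure.storage.blob import ContainerClient
    cc = ContainerClient(
        account_url=f"https://{account_name}.blob.core.windows.net",
        container_name=container,
        credential=credential,
    )
    # "versions" can also be passed to include= to surface blob versions.
    for p in cc.list_blobs(include=["snapshots", "deleted"]):
        yield {"name": p.name, "size": p.size,
               "snapshot": p.snapshot, "deleted": p.deleted}

# Usage (placeholder names):
# print(tally_bytes(list_all_blobs("mystorageacct", "data", my_credential)))
```

If the "soft-deleted" or "snapshot" buckets are large, the capacity graph can stay high long after the visible files are gone, consistent with the 48-hour lag described above.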