Tags: azure, azure-blob-storage, azure-eventhub

How can I remove event hub partitions from Azure blob storage through code?


I am using Azure Event Hubs in a C# Winforms project.

I create EventProcessorHost and EventReceiver objects to retrieve messages from the event hub and display them.

Part of my message retrieval process involves creating a new consumer group on my Event Hub when my form is opened. (I just use a new GUID as the consumer group name.)

All of this works.

When the form is closed, the consumer group is deleted from the Event Hub, and this is validated by viewing the Event Hub through the portal.

However, the partition objects used by the consumer groups to do the Event Hub work still exist in the Storage Account.

When going through CloudBerry explorer, I see this:

[Screenshot: a list of GUID-named folders in the storage container]

Each GUID is a consumer group. There are hundreds here from the last few months of my development, but an Event Hub can only contain 20 active consumer groups at a time.

Inside each consumer group folder are 4 files, one for each of the 4 partitions used by that consumer group.

Is there an API call on an Event Hub object (EventReceiver, EventProcessorHost, etc.) that can clean these up for me in an automated way? I have looked but have not found anything and documentation on Event Hubs is currently minimal.

I looked at EventProcessorHost.PartitionManagerOptions.SkipBlobContainerCreation = true but this did not help.

If not, is there a setting on the storage account that needs to be set to avoid this buildup of junk?

Thanks!


Solution

  • I got this to work in the end.

    This is really just deleting blobs from a storage account with a slight twist.

    First, when each of your IEventProcessor objects is opened, you need to store away its lease information:

        Task IEventProcessor.OpenAsync(PartitionContext context)
            {
            // Remember this partition's lease so its blob can be released and deleted at shutdown.
            Singleton.Instance.AddLease(context.Lease);
            Singleton.Instance.ShowUIRunning();
            return Task.FromResult<object>(null);
            }
    

    Where "Singleton" is just a singleton object I have created where multiple threads can dump their information. Singleton's AddLease implementation:

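        private readonly object leaseLock = new object();

        public void AddLease(Lease l)
            {
            // Several threads may call this at once, and Dictionary is not thread-safe.
            lock (leaseLock)
                {
                // The indexer adds the key or overwrites the existing token.
                PartitionIdToLease[l.PartitionId] = l.Token;
                }
            }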
    

    Where 'PartitionIdToLease' is a

    Dictionary<string, string>
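Because several processor threads write to this map, a concurrent collection is one way to avoid explicit locking. The following is a minimal sketch, not the answer's actual Singleton: `LeaseStore`, `TryGetLease`, and the sample tokens are hypothetical names introduced only for illustration, and the store takes plain strings instead of the SDK's `Lease` type so that it runs without any Azure packages.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for the Singleton's lease map (partition ID -> lease token).
public sealed class LeaseStore
{
    private readonly ConcurrentDictionary<string, string> partitionIdToLease =
        new ConcurrentDictionary<string, string>();

    // Adds the token, or overwrites the previous token for the same partition.
    public void AddLease(string partitionId, string token)
    {
        partitionIdToLease[partitionId] = token;
    }

    // Returns false when no lease was ever recorded for the partition.
    public bool TryGetLease(string partitionId, out string token)
    {
        return partitionIdToLease.TryGetValue(partitionId, out token);
    }
}

public static class LeaseStoreDemo
{
    public static void Main()
    {
        var store = new LeaseStore();

        // Simulate four processors registering their leases concurrently.
        Parallel.For(0, 4, i => store.AddLease(i.ToString(), "token-" + i));

        string token;
        Console.WriteLine(store.TryGetLease("2", out token) ? token : "(none)");
    }
}
```

With the real SDK types, `AddLease` would receive `context.Lease` and store `l.PartitionId` and `l.Token` as in the answer's code.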
    

    Now, the delete code:

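    // Requires: using Microsoft.WindowsAzure.Storage; using Microsoft.WindowsAzure.Storage.Blob;
    CloudStorageAccount acc = CloudStorageAccount.Parse("Your Storage Account Connection String");
    CloudBlobClient client = acc.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference("Name of Event Hub");
    // In the question's layout, the folder is named after the consumer group (a GUID).
    CloudBlobDirectory directory = container.GetDirectoryReference("Name of Folder");

    foreach (IListBlobItem item in directory.ListBlobs())
        {
        CloudBlockBlob cb = item as CloudBlockBlob;
        if (cb == null)
            continue;

        // We want the name of the file only, and cb.Name gives us "Folder/Name".
        string partitionNumber = cb.Name.Substring(cb.Name.LastIndexOf('/') + 1);

        // Skip partitions for which this process never stored a lease.
        string leaseId;
        if (!Singleton.Instance.PartitionIdToLease.TryGetValue(partitionNumber, out leaseId))
            continue;

        // A blob with an active lease cannot be deleted without the lease ID, so release the lease first.
        cb.ReleaseLease(AccessCondition.GenerateLeaseCondition(leaseId));
        cb.DeleteIfExists();
        }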
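The name parsing in the delete loop can be checked on its own, without a storage account. This is a small sketch of that one step; `BlobNames.GetPartitionId` and the sample blob names are hypothetical and not part of any Azure SDK.

```csharp
using System;

public static class BlobNames
{
    // Returns the text after the last '/' in a blob name,
    // or the whole name when it contains no '/'.
    public static string GetPartitionId(string blobName)
    {
        if (blobName == null)
            throw new ArgumentNullException("blobName");

        // LastIndexOf returns -1 when there is no '/', so Substring(0) keeps the full name.
        int slash = blobName.LastIndexOf('/');
        return blobName.Substring(slash + 1);
    }
}

public static class BlobNamesDemo
{
    public static void Main()
    {
        // Made-up blob names in the "Folder/Name" shape described above.
        Console.WriteLine(BlobNames.GetPartitionId("my-consumer-group/3"));
        Console.WriteLine(BlobNames.GetPartitionId("0"));
    }
}
```

Splitting on the last separator rather than the first keeps the result correct even if the folder path ever gains another level.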
    

    So now every time my application closes, it is responsible for deleting the junk it generated in the storage account.

    Hope this helps someone.