I have a list of over 10,000 blobs that I'm returning through an API call. I'd like to call this API in a loop, returning 100 blobs each time until I've reached the end, so that I can start using the files while the rest load in the background. Right now my code gets the full list of 10,000 blobs and then returns just a section of them at a time. Instead, I'd like to request only the batch of 100 I need from Azure Storage, so that I'm not fetching the full list every time. Is there a way to do this?
I'm not sure how much this helps, but here's some code:
var container = GetContainer(prof.Container, dataCenterId);
var blobList = container.GetBlobs(prefix: prefix);
// blobList here contains all 10,000 blobs. I just want the section of n files I need.
foreach (var b in blobList)
{
    if (fileNames.Count < nFilesToReturn)
    {
        fileNames.Add(b.Name);
    }
    else
    {
        return fileNames;
    }
}
First, I uploaded 150 blobs to a container in the storage account through the Azure portal.
Then I tried the C# code below to retrieve 100 blobs from storage, and it returned the names of 100 blobs.
Code:
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    // Lists one segment of at most batchSize blobs and returns their names.
    static async Task<IEnumerable<string>> GetBlobNamesInBatchAsync(CloudBlobContainer container, int batchSize, BlobContinuationToken continuationToken)
    {
        // useFlatBlobListing: true returns blobs only (no virtual directories).
        BlobResultSegment resultSegment = await container.ListBlobsSegmentedAsync("", true, BlobListingDetails.None, batchSize, continuationToken, null, null);
        return resultSegment.Results.Select(blobItem => (blobItem as CloudBlob).Name);
    }

    static async Task Main(string[] args)
    {
        string storageConnectionString = "<connec_string>";
        string containerName = "<container_name>";
        int batchSize = 100;

        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(storageConnectionString);
        CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
        CloudBlobContainer container = blobClient.GetContainerReference(containerName);

        // A null token starts the listing from the beginning of the container.
        BlobContinuationToken continuationToken = null;
        List<string> fileNames = new List<string>();
        do
        {
            var batch = await GetBlobNamesInBatchAsync(container, batchSize, continuationToken);
            fileNames.AddRange(batch);
            // The helper does not return the next token, so the token is reset here
            // and the loop ends after the first batch of 100 names.
            continuationToken = null;
        } while (continuationToken != null);

        foreach (string fileName in fileNames)
        {
            Console.WriteLine(fileName);
        }
    }
}
Output:
It ran successfully and retrieved the names of 100 blobs.
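The code above stops after the first batch because continuationToken is reset to null inside the loop. To keep fetching 100 at a time until the end, the token that comes back with each segment has to be passed into the next call. Here is a minimal sketch of that, assuming the same Microsoft.WindowsAzure.Storage package; the class name PagedListing and the helper GetBlobNamesPageAsync are hypothetical names made up for illustration, and I have not run this against a live account.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

class PagedListing
{
    // Hypothetical helper: returns one batch of blob names together with the
    // token needed to request the following batch (null when nothing is left).
    static async Task<(List<string> Names, BlobContinuationToken NextToken)> GetBlobNamesPageAsync(
        CloudBlobContainer container, string prefix, int batchSize, BlobContinuationToken token)
    {
        BlobResultSegment segment = await container.ListBlobsSegmentedAsync(
            prefix, true, BlobListingDetails.None, batchSize, token, null, null);

        // With a flat listing the results are blobs, so OfType just narrows the type.
        List<string> names = segment.Results
            .OfType<CloudBlob>()
            .Select(blob => blob.Name)
            .ToList();

        return (names, segment.ContinuationToken);
    }

    static async Task Main()
    {
        // Placeholders, as in the code above.
        CloudStorageAccount account = CloudStorageAccount.Parse("<connec_string>");
        CloudBlobContainer container = account
            .CreateCloudBlobClient()
            .GetContainerReference("<container_name>");

        BlobContinuationToken token = null; // null starts from the first blob
        do
        {
            var (names, nextToken) = await GetBlobNamesPageAsync(container, "", 100, token);

            // This batch can be used right away; later batches are not fetched yet.
            Console.WriteLine($"Fetched {names.Count} names");

            token = nextToken; // carry the token forward instead of resetting it
        } while (token != null);
    }
}
```

Each call asks the service for at most 100 results, so the full list is never pulled in one go; a segment may contain fewer than 100 entries, and the loop should rely on the token rather than on the count. If the API hands out one batch per request, the token has to travel to the caller and back; as far as I know, BlobContinuationToken.NextMarker is the string it wraps. If you are on the newer Azure.Storage.Blobs package (which the GetBlobs(prefix: prefix) call in the question suggests), the equivalent should be container.GetBlobs(prefix: prefix).AsPages(continuationToken, pageSizeHint: 100), where each Page exposes Values and a ContinuationToken.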