Tags: .net, linq, lazy-loading, azure-blob-storage, azure-sdk-.net

How to query Cloud Blobs on Windows Azure Storage


I am using Microsoft.WindowsAzure.StorageClient to manipulate blobs on Azure storage. I have come to the point where the user needs to list the uploaded files and modify or delete them. Since there are many files in one container, what is the best way to query the Azure storage service so that it returns only the desired files? I would also like to return only a specific number of blobs at a time so I can implement paging.

There is a method called ListBlobs on CloudBlobContainer, but it seems to return all of the blobs in the container. That will not work for me.

I searched a lot on this topic and could not find anything useful. This link shows only the basics.

--------- EDIT

My answer below does not retrieve the blobs lazily; it retrieves all of the blobs in the container and then filters the result. Currently there is no solution for retrieving blobs lazily.
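For illustration, the filter-after-retrieval approach described in this edit could look roughly like the sketch below. It is not the original answer's code. `BlobEntry` and `BlobPaging.GetPage` are hypothetical names invented here, and the input sequence stands in for whatever the container listing returns. Because the filtering and paging run in memory with LINQ, the whole listing is still fetched first.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one item of a container listing.
public sealed class BlobEntry
{
    public string Name { get; set; }
    public long SizeInBytes { get; set; }
}

public static class BlobPaging
{
    // Filters an already-retrieved listing by name prefix and returns one page.
    // pageIndex is zero-based. The entire input is enumerated, so this does
    // not reduce the amount of data pulled from storage.
    public static List<BlobEntry> GetPage(
        IEnumerable<BlobEntry> allBlobs, string namePrefix, int pageIndex, int pageSize)
    {
        if (allBlobs == null) throw new ArgumentNullException("allBlobs");
        if (pageIndex < 0) throw new ArgumentOutOfRangeException("pageIndex");
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        string prefix = namePrefix ?? string.Empty;

        return allBlobs
            .Where(b => b.Name.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            .OrderBy(b => b.Name, StringComparer.OrdinalIgnoreCase) // stable page order
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .ToList();
    }
}
```

Sorting before `Skip` and `Take` keeps the pages consistent between requests, at the cost of ordering the full filtered set each time.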


Solution

  • What I've realized about Windows Azure blob storage is that it is bare-bones. As in extremely bare-bones. You should use it only to store documents and associated metadata and then retrieve individual blobs by ID.

    I recently migrated an application from MongoDB to Windows Azure blob storage. Coming from MongoDB, I was expecting a bunch of different efficient ways to retrieve documents. After migrating, I now rely on a traditional RDBMS and ElasticSearch to store blob information in a more searchable way.

    It's really too bad that Windows Azure blob storage is so limiting. I hope to see much-enhanced searching capabilities in the future (e.g., search by metadata, property, blob name regex, etc.). Additionally, indexes based on map/reduce would be awesome. Microsoft would have the chance to convert a lot of folks over from other document storage systems if it did these things.
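As a rough sketch of the pattern described above (keep searchable blob information outside blob storage, query that, then fetch individual blobs by name), the index could be modelled as follows. Everything here is an assumption made for illustration: `BlobIndexRecord`, `BlobIndex`, and the `Owner` and `UploadedUtc` fields are hypothetical, and the in-memory dictionary merely stands in for an RDBMS table or an ElasticSearch index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical index row; in practice a table row or a search document.
public sealed class BlobIndexRecord
{
    public string BlobName { get; set; }
    public string Owner { get; set; }
    public DateTime UploadedUtc { get; set; }
}

// In-memory stand-in for the external, searchable index.
public sealed class BlobIndex
{
    private readonly Dictionary<string, BlobIndexRecord> _records =
        new Dictionary<string, BlobIndexRecord>(StringComparer.Ordinal);

    // Call after each successful upload so the index stays in step with storage.
    public void Upsert(BlobIndexRecord record)
    {
        if (record == null) throw new ArgumentNullException("record");
        _records[record.BlobName] = record;
    }

    // Call after each successful delete. Returns false if the name was not indexed.
    public bool Remove(string blobName)
    {
        return _records.Remove(blobName);
    }

    // Returns one zero-based page of blob names for an owner, newest first.
    public List<string> FindByOwner(string owner, int pageIndex, int pageSize)
    {
        if (pageIndex < 0) throw new ArgumentOutOfRangeException("pageIndex");
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");

        return _records.Values
            .Where(r => string.Equals(r.Owner, owner, StringComparison.OrdinalIgnoreCase))
            .OrderByDescending(r => r.UploadedUtc)
            .ThenBy(r => r.BlobName, StringComparer.Ordinal) // tie-breaker for stable pages
            .Skip(pageIndex * pageSize)
            .Take(pageSize)
            .Select(r => r.BlobName)
            .ToList();
    }
}
```

The names a query returns are then used to fetch the individual blobs, so blob storage is only ever asked for items by name. The trade-off is that every upload and delete has to update two places.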