json azure azure-cosmosdb azure-blob-storage azure-redis-cache

Azure Cosmos DB vs. Blob Storage for intermediate/temporary storage


We have a requirement in our Web API to pull a payload from an external API as JSON, cleanse it, and post it into Azure SQL. We currently rely on Blob Storage for this: we store the JSON payload in an Azure blob and retrieve it in the UI for the data cleanup activity. Users can spend a good amount of time validating the data and modifying it as required. A user may keep it as a draft for some days and click the Import button once all the cleansing is done. I have observed that, during those drafts, the blob is retrieved and deserialized into a list of objects just to find the properties to be updated. Once the updates are done and the user clicks Draft, the same list is serialized back to JSON and stored in the blob again. This serialization/deserialization round trip seems expensive, so I am thinking of replacing the blob with Cosmos DB. Would that really improve performance? Would Azure SQL's JSON support be more feasible than these options? I am also considering Redis Cache. Cost effectiveness is a major factor in the decision as well.


Solution

  • You will get a significant performance advantage from Cosmos DB if you need to search over the JSON objects and the search results are significantly smaller than the whole list of objects. You will pay the serialization/deserialization price anyway for the objects returned by a query, since they have to be sent over the network to your application.

    Cosmos DB costs much more than standard Blob Storage, but it is a very easy tool to work with for JSON workloads. It offers both SQL and MongoDB query APIs, so to some extent you will be able to design a database-agnostic application (at least at the query level).

    I think it makes sense to use Redis Cache if some lists of JSON objects are consulted more frequently than others: you can preload them into the cache, take advantage of faster search operations, and upload them to persistent Blob Storage later.
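To make the first point concrete, here is a rough sketch (not a benchmark, and not tied to any Azure SDK) of why result size matters. Two in-memory stand-ins are compared: one keeps the whole list as a single JSON string, as a blob would, and the other keeps one JSON document per record, as a document store would. The class names, the record fields, and the `chars_processed` counter are all hypothetical and exist only for illustration; real costs also depend on network latency and pricing, which this sketch ignores.

```python
import json


def make_payload(n):
    """Build a hypothetical payload of n records (field names are made up)."""
    return [{"id": str(i), "name": f"item-{i}", "status": "raw"} for i in range(n)]


class WholeBlobStore:
    """In-memory stand-in for one blob that holds the entire JSON array."""

    def __init__(self, records):
        self.blob = json.dumps(records)
        self.chars_processed = 0  # JSON text parsed or produced so far

    def update(self, record_id, **changes):
        # Every draft save has to read and parse the full list...
        self.chars_processed += len(self.blob)
        records = json.loads(self.blob)
        for rec in records:
            if rec["id"] == record_id:
                rec.update(changes)
        # ...and then write the full list back.
        self.blob = json.dumps(records)
        self.chars_processed += len(self.blob)


class PerRecordStore:
    """In-memory stand-in for a store with one JSON document per record."""

    def __init__(self, records):
        self.docs = {r["id"]: json.dumps(r) for r in records}
        self.chars_processed = 0

    def update(self, record_id, **changes):
        # Only the affected document is parsed and rewritten.
        raw = self.docs[record_id]
        self.chars_processed += len(raw)
        rec = json.loads(raw)
        rec.update(changes)
        raw = json.dumps(rec)
        self.docs[record_id] = raw
        self.chars_processed += len(raw)

    def export(self):
        """Return all records as a list, e.g. for the final Import step."""
        return [json.loads(d) for d in self.docs.values()]


payload = make_payload(1000)
blob_store = WholeBlobStore(payload)
doc_store = PerRecordStore(payload)

# Simulate a single draft edit against each layout.
for store in (blob_store, doc_store):
    store.update("42", status="cleansed")

print("whole blob :", blob_store.chars_processed, "characters of JSON handled")
print("per record :", doc_store.chars_processed, "characters of JSON handled")
```

Both layouts end up with the same data, but the per-record layout touches only the edited document on each draft save. If the UI always loads and saves the entire list anyway, that advantage largely disappears, which is the caveat raised in the first point above.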