Search code examples
sql-serverappfabric

Why use AppFabric when denormalized SQL Server data seems to perform as well?


I am working on an eCommerce website designed to present a large number of SKUs. The SQL Server schema describing these products is normalized to the extent that, a few years ago, it became unreasonably slow to retrieve the necessary information to present to customers, so we changed our infrastructure such that we would bear the cost of loading the data for each product once and then store that data in an AppFabric cache (previously Velocity).

Over time, the complexity of requirements placed on our AppFabric infrastructure has grown (imagine that), forcing us to spend a considerable amount of time writing code for handling data retrieval from our cache, data updates including incremental updates, etc.

We happen to have much of our product data stored in a denormalized form in a side database, so for experimentation's sake I wrote a console app to randomly select one of our ~150K SKUs at a time, and then retrieve the record for that product from our denormalized table.

I was surprised to find that I was able to select these records in about the same average time that I could select a record from our AppFabric cache, about 2.5 ms average in both cases. I'm sure in both cases the data is coming from an in-memory cache of one sort or another, be it AppFabric or disk cache, and the 2.5 ms is bumping against a bare minimum amount of time for a network round trip.

This makes me think we might be better off just using denormalized data in SQL Server for our high load/high performance needs. The management tools for SQL Server-based data are so much better. All of the devs on our team are adept at using Management Studio, whereas with AppFabric we have one dev who can use PowerShell to a) Give us a count of records stored in the cache and b) dump the cache. Any other management functionality we have to create ourselves.

This makes me ask why anyone would want to use AppFabric at all. We are not concerned with cost, because the cost of the development efforts we have to apply to an AppFabric-related solution vastly outweigh even the cost of SQL Server licensing.

Thank you for whatever feedback you can provide to help our team decide the best direction to move forward.


Solution

  • Deciding to use a caching mechanism should be a very thought out process -- and isn't really always the right choice. However, the primary reason for using caching over a durable persistance model is to manage an extremely high transaction load.

    In AppFabric Cache I can setup a distributed set of servers to work off of one logical repository -- with built in load balancing. So, unlike Microsoft SQL Server which has no way of providing clustered instances for the purpose of load balancing -- if I'm reading and writing 50 to 100 million times a day the cache is a more viable solution for sharing those resources. Then those writes can be queued to the durable persistence model over time ensuring that there are no real peaks in usage because it's spread out both across the caching fabric and the durable store.