I have a Reliable Dictionary partitioned across a cluster of 7 nodes (60 partitions). I've set up the remoting listener like this:
var settings = new FabricTransportRemotingListenerSettings
{
    MaxMessageSize = Common.ServiceFabricGlobalConstants.MaxMessageSize,
    MaxConcurrentCalls = 200
};
return new[]
{
    new ServiceReplicaListener((c) => new FabricTransportServiceRemotingListener(c, this, settings))
};
I am trying to do a load test to prove Reliable Dictionary "read" performance will not decrease under load. I have a "read" from dictionary method like this:
using (ITransaction tx = this.StateManager.CreateTransaction())
{
    IAsyncEnumerable<KeyValuePair<PriceKey, Price>> items;
    IAsyncEnumerator<KeyValuePair<PriceKey, Price>> e;
    items = await priceDictionary.CreateEnumerableAsync(tx,
        (item) => item.Id == id, EnumerationMode.Unordered);
    e = items.GetAsyncEnumerator();
    while (await e.MoveNextAsync(CancellationToken.None))
    {
        var p = new Price(
            e.Current.Key.Id,
            e.Current.Key.Version, e.Current.Key.Id, e.Current.Key.Date,
            e.Current.Value.Source, e.Current.Value.Price, e.Current.Value.Type,
            e.Current.Value.Status);
        intermediatePrice.TryAdd(new PriceKey(e.Current.Key.Id, e.Current.Key.Version, id, e.Current.Key.Date), p);
    }
}
return intermediatePrice;
Each partition has around 500,000 records. Each "key" in dictionary is around 200 bytes and "Value" is around 600 bytes. When I call this "read" directly from a browser [calling the REST API which in turn calls the stateful service], it takes 200 milliseconds. If I run this via a load test with, let's say, 16 parallel threads hitting the same partition and same record, it takes around 600 milliseconds on average per call. If I increase the load test parallel thread count to 24 or 30, it takes around 1 second for each call. My question is, can a Service Fabric Reliable Dictionary handle parallel "read" operations, just like SQL Server can handle parallel concurrent reads, without affecting throughput?
If you check the Remarks section for the Reliable Dictionary CreateEnumerableAsync method, you can see that it was designed to work concurrently, so concurrency itself is not the issue:
The returned enumerator is safe to use concurrently with reads and writes to the Reliable Dictionary. It represents a snapshot consistent view
The problem is that concurrent does not mean fast.
When you make your query this way, it will:
When you have a huge number of queries running this way, many factors come into play:
The best way to work with a Reliable Dictionary is to retrieve values by key, because it knows exactly where the data for a specific key is stored and does not add the extra overhead of searching for it.
If you really want to query it this way, I would recommend designing it like an index table: store the data in one dictionary, indexed by its primary key, and keep another dictionary whose key is the searched value and whose value is the key into the main dictionary. This would be much faster.
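To make the index-table idea concrete, here is a minimal in-memory sketch. It is not Service Fabric code: plain Dictionary instances stand in for the two reliable dictionaries, and PriceKey, Price, IndexedPriceStore, Put and GetById are simplified, hypothetical names used only for illustration. In a real stateful service, both collections would presumably be reliable dictionaries updated inside the same transaction, with reads going through key lookups rather than a filtered enumeration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-ins for the question's PriceKey and Price types.
public readonly record struct PriceKey(string Id, int Version, DateTime Date);
public sealed record Price(string Source, decimal Amount);

// In-memory model of the index-table pattern (an assumption-laden sketch,
// not the Reliable Collections API).
public sealed class IndexedPriceStore
{
    // Main table: full key -> value.
    private readonly Dictionary<PriceKey, Price> main = new();

    // Index table: searched value (the id) -> keys into the main table.
    private readonly Dictionary<string, List<PriceKey>> keysById = new();

    public void Put(PriceKey key, Price price)
    {
        bool isNew = !main.ContainsKey(key);
        main[key] = price;
        if (!isNew)
        {
            return; // Key is already indexed; only the value changed.
        }

        if (!keysById.TryGetValue(key.Id, out var keys))
        {
            keys = new List<PriceKey>();
            keysById[key.Id] = keys;
        }
        keys.Add(key);
    }

    // One lookup in the index plus one lookup per matching key,
    // instead of scanning every entry in the main table.
    public IReadOnlyDictionary<PriceKey, Price> GetById(string id)
    {
        var result = new Dictionary<PriceKey, Price>();
        if (keysById.TryGetValue(id, out var keys))
        {
            foreach (var key in keys)
            {
                result[key] = main[key];
            }
        }
        return result;
    }
}
```

The cost of a read then depends on how many records share the id, not on the roughly 500,000 records in the partition. The trade-off is that every write has to keep both tables in sync.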