I am evaluating the performance of Service Fabric with a Reliable Dictionary of ~1 million keys. I'm getting fairly disappointing results, so I wanted to check if either my code or my expectations are wrong.
I have a dictionary initialized with

dict = await _stateManager.GetOrAddAsync<IReliableDictionary2<string, string>>("test_" + id);

where id is unique for each test run.
I populate it with a list of strings like "1-1-1-1-1-1-1-1-1", "1-1-1-1-1-1-1-1-2", "1-1-1-1-1-1-1-1-3", ... up to 576,000 items. The values in the dictionary are not used; I'm currently just storing "1" for every key.
It takes about 3 minutes to add all the items to the dictionary. I have to split the work into transactions of 100,000 operations at a time, otherwise it seems to hang forever. (Is there a limit to the number of operations in a transaction before you need to CommitAsync()?)
// take100_000 is the next 100,000 keys from the original list of 576,000
using (var tx = _stateManager.CreateTransaction())
{
    foreach (var tick in take100_000)
    {
        await dict.AddAsync(tx, tick, "1");
    }
    await tx.CommitAsync();
}
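For context, the outer loop that produces take100_000 looks roughly like this (a sketch; allKeys is the full 576,000-key list, and Enumerable.Chunk requires .NET 6+, so use Skip/Take on older frameworks):

// Commit the full key list in batches of 100,000 so no single
// transaction grows unboundedly before CommitAsync().
foreach (var take100_000 in allKeys.Chunk(100_000))
{
    using (var tx = _stateManager.CreateTransaction())
    {
        foreach (var tick in take100_000)
        {
            await dict.AddAsync(tx, tick, "1");
        }
        await tx.CommitAsync();
    }
}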
After that, I need to iterate through the dictionary to visit each item:
using (var tx = _stateManager.CreateTransaction())
{
    // ct is a CancellationToken
    var enumerator = (await dict.CreateEnumerableAsync(tx)).GetAsyncEnumerator();
    while (await enumerator.MoveNextAsync(ct))
    {
        var tick = enumerator.Current.Key;
        // do something with tick
    }
}

(The original code wrapped the loop in a try/catch that did throw ex, which only resets the stack trace; I've removed it here.)
This takes 16 seconds.
I'm not so concerned about the write time; I know it has to be replicated and persisted. But why does it take so long to read? 576,000 17-character string keys should be no more than 11.5 MB in memory, and the values are only a single character and are ignored. Aren't Reliable Collections cached in RAM? Iterating through a regular Dictionary with the same contents takes 13 ms.
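For reference, the in-memory baseline I compared against looks roughly like this (a sketch; allKeys is the same 576,000-key list and Stopwatch is System.Diagnostics.Stopwatch):

// Illustrative baseline: the same keys in a plain in-memory Dictionary.
var plain = allKeys.ToDictionary(k => k, _ => "1");

var sw = Stopwatch.StartNew();
foreach (var kvp in plain)
{
    var tick = kvp.Key; // same per-item work as the reliable version
}
sw.Stop();
Console.WriteLine($"Plain Dictionary: {sw.ElapsedMilliseconds} ms");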
I then called ContainsKeyAsync 576,000 times against an empty dictionary (in one transaction). This took 112 seconds. Trying this on probably any other data structure would take ~0 ms.
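The probe looked roughly like this (a sketch; emptyDict and allKeys are illustrative names):

using (var tx = _stateManager.CreateTransaction())
{
    foreach (var tick in allKeys)
    {
        // Every lookup misses, since the dictionary is empty.
        bool found = await emptyDict.ContainsKeyAsync(tx, tick);
    }
    await tx.CommitAsync();
}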
This is on a local 1 node cluster. I got similar results when deployed to Azure.
Are these results plausible? Any configuration I should check? Am I doing something wrong, or are my expectations wildly inaccurate? If so, is there something better suited to these requirements? (~1 million tiny keys, no values, persistent transactional updates)
Ok, for what it's worth:
Not everything is stored in memory. To support large Reliable Collections, some values are cached and some reside on disk, which can lead to extra I/O when retrieving the data you request. I've heard a rumor that at some point we may get a chance to adjust the caching policy, but I don't think that has been implemented yet.
You iterate through the data reading records one by one. IMHO, if you issue half a million separate sequential queries against any data source, the outcome won't be very promising. I'm not saying that every single MoveNextAsync() results in a separate I/O operation, but overall it doesn't behave like a single fetch.
It also depends on the resources you have. For instance, reproducing your case on my local machine with a single partition and three replicas, I get the records in 5 seconds on average.
Thinking about a workaround, here is what comes to mind:
Chunking: I tried the same thing splitting the records into string arrays capped at 10 elements (IReliableDictionary<string, string[]>). Essentially it was the same amount of data, but the enumeration time dropped from 5 seconds to 7 ms. I suppose that as long as you keep each item below 80 KB, thereby reducing the number of round-trips and staying off the Large Object Heap, you should see improved performance (see the sketch after this list).
Filtering: CreateEnumerableAsync has an overload that lets you specify a key filter delegate, so values are not retrieved from disk for keys that do not match (also sketched below).
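A minimal sketch of the chunking idea (chunkedDict and allKeys are illustrative names; Enumerable.Chunk requires .NET 6+):

// Pack 10 keys per entry so enumeration fetches 10x fewer records.
var chunkedDict = await _stateManager
    .GetOrAddAsync<IReliableDictionary2<string, string[]>>("test_chunked_" + id);

using (var tx = _stateManager.CreateTransaction())
{
    var i = 0;
    foreach (var chunk in allKeys.Chunk(10))
    {
        await chunkedDict.AddAsync(tx, (i++).ToString(), chunk);
    }
    await tx.CommitAsync();
}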
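And a sketch of the filtered enumeration (the predicate is illustrative; EnumerationMode.Unordered avoids the cost of sorted iteration):

using (var tx = _stateManager.CreateTransaction())
{
    var enumerable = await dict.CreateEnumerableAsync(
        tx,
        key => key.StartsWith("1-1-"),   // values are only read for matching keys
        EnumerationMode.Unordered);

    var enumerator = enumerable.GetAsyncEnumerator();
    while (await enumerator.MoveNextAsync(ct))
    {
        var tick = enumerator.Current.Key;
        // only matching records reach this point
    }
}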
Hopefully it makes sense.