I have an Ignite cache:
IgniteCache<String, Record> cache;
A collection of keys of this cache is given. I need to find the entries with these keys and keep only those that pass an additional filter (for example, 'field name has value John').
One way I tried was using the getAll() method and applying the filtering on my side:
cache.getAll(keys).values().stream()
    .filter(... filter logic...)
    .collect(toList());
This works, but if the additional filter has high selectivity (i.e. it rejects a lot of data), we waste a lot of time sending unneeded data over the network.
Another option is using a scan:
cache.query(new ScanQuery<>(new IsKeyIn(keys).and(new CustomFilter())))
This does all the filtering on the server nodes, but it is a full scan: if the cache holds many entries and the input keys are only a small fraction of them, a lot of time is wasted again, this time on unnecessary scanning.
There is also invokeAll(), which allows filtering on the server nodes:
cache.invokeAll(new TreeSet<>(keys), new AdditionalFilter())
    .values().stream()
    .map(EntryProcessorResult::get)
    .collect(toList());
where
private static class AdditionalFilter implements CacheEntryProcessor<String, Record, Record> {
    @Override
    public Record process(MutableEntry<String, Record> entry,
            Object... arguments) throws EntryProcessorException {
        if (... record matches the filter ...) {
            return entry.getValue();
        }
        return null;
    }
}
It finds entries by their keys and runs the filtering logic on the server nodes, but on my data it is even slower than the scan. I suppose (but am not sure) this is because invokeAll() is potentially an updating operation, so (according to its Javadoc) it takes locks on the corresponding keys.
I would like to be able to find entries by the given keys, apply additional filtering on the server nodes, and not pay for additional locks (in my case it's a read-only operation).
Is it possible?
My cache is distributed among 3 server nodes, and its atomicity mode is TRANSACTIONAL_SNAPSHOT. The reads are done within a transaction.
SQL is the simplest solution, and possibly the fastest, given proper indexes.
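As a rough illustration of the SQL route, the sketch below only builds the query text and the argument array. It assumes the cache's Record type is exposed to SQL as a table named Record with a queryable (ideally indexed) name column; neither is stated in the question, and the class name SqlKeyLookupSketch is made up for this example. The join against TABLE(id VARCHAR = ?) is a common way to pass a whole key collection as one parameter instead of a long IN list.

```java
import java.util.Collection;

// Hypothetical helper: table name "Record" and column "name" are assumptions.
class SqlKeyLookupSketch {
    // _key and _val are the implicit key/value columns of an Ignite SQL table.
    static final String SQL =
        "SELECT r._key, r._val FROM Record r " +
        "JOIN TABLE(id VARCHAR = ?) k ON r._key = k.id " +
        "WHERE r.name = ?";

    // First argument is the whole key array (bound to the TABLE function),
    // second argument is the value the name column must equal.
    static Object[] args(Collection<String> keys, String name) {
        return new Object[] { keys.toArray(), name };
    }
}
```

With Ignite on the classpath, this would presumably be run as cache.query(new SqlFieldsQuery(SqlKeyLookupSketch.SQL).setArgs(SqlKeyLookupSketch.args(keys, "John"))).getAll(), so only the matching rows travel back to the caller.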
Another option is IgniteCompute#broadcast + IgniteCache#localPeek:
Collection<Key> keys = ...;
// One job per server node; each node returns only its own matching values.
Collection<Collection<Value>> results = compute.broadcast(new LocalGetter(), keys);
...
class LocalGetter implements IgniteClosure<Collection<Key>, Collection<Value>> {
    @Override public Collection<Value> apply(Collection<Key> keys) {
        IgniteCache<Key, Value> cache = ...;
        Collection<Value> res = new ArrayList<>(keys.size());
        for (Key key : keys) {
            // Returns null unless this node holds the primary copy of the key.
            Value val = cache.localPeek(key, CachePeekMode.PRIMARY);
            if (val != null && filterMatches(val)) {
                res.add(val);
            }
        }
        return res;
    }
}
This way we retrieve cache entries efficiently by key, then apply the filter locally, and only send matching entries back over the network. There are only N network calls, where N is the number of server nodes.
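To make the data flow concrete, here is a plain-Java model of the same pattern that uses no Ignite classes. Node, localGet and broadcastAndMerge are hypothetical stand-ins for a server node, LocalGetter.apply and compute.broadcast respectively, and values are simplified to strings. It is a sketch of the idea, not of Ignite's actual behavior.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class BroadcastPeekSketch {
    // Stand-in for one server node: holds only the entries it is primary for.
    static class Node {
        final Map<String, String> primaryEntries = new HashMap<>();

        // Models LocalGetter.apply: look up each key locally, keep matches only.
        Collection<String> localGet(Collection<String> keys, Predicate<String> filter) {
            Collection<String> res = new ArrayList<>();
            for (String key : keys) {
                String val = primaryEntries.get(key); // null if the key lives elsewhere
                if (val != null && filter.test(val)) {
                    res.add(val);
                }
            }
            return res;
        }
    }

    // Models compute.broadcast: one call per node, then flatten the per-node results.
    static List<String> broadcastAndMerge(List<Node> nodes, Collection<String> keys,
                                          Predicate<String> filter) {
        List<String> merged = new ArrayList<>();
        for (Node node : nodes) {
            merged.addAll(node.localGet(keys, filter));
        }
        return merged;
    }

    static List<String> demo() {
        Node a = new Node();
        a.primaryEntries.put("k1", "John");
        a.primaryEntries.put("k2", "Mary");
        Node b = new Node();
        b.primaryEntries.put("k3", "John");
        b.primaryEntries.put("k4", "John"); // never requested, so never returned
        Node c = new Node();                // owns none of the requested keys
        return broadcastAndMerge(Arrays.asList(a, b, c),
                                 Arrays.asList("k1", "k2", "k3"), "John"::equals);
    }
}
```

Every node receives the full key collection and silently skips keys it does not own, so the caller needs no knowledge of key-to-node mapping; the price is that the key collection itself is sent to all N nodes.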