What is the most performant way to update a field of every existing value in an Ignite cache using data from another cache in the same cluster (tens of millions of records, about a kilobyte each)?
Pseudo code:
try (mappings = getCache("mappings")) {
try (entities = getCache("entities")) {
entities.foreach((key, entity) -> entity.setInternalId(mappings.getValue(entity.getExternalId())));
}
}
I would advise using compute to send a closure to all the nodes in the cache topology. On each node, you would then iterate over the local primary set and do the updates there. Even with this approach, you would still be better off batching the updates and issuing them with a putAll call (or perhaps using IgniteDataStreamer).
NOTE: for the example below, it is important that the keys in the "mappings" and "entities" caches are either identical or collocated, so that each mapping lookup is served by the node that owns the entity. More information on collocation is here: https://apacheignite.readme.io/docs/affinity-collocation
The pseudo code would look something like this:
// Target the nodes that hold data for the cache.
ClusterGroup cacheNodes = ignite.cluster().forDataNodes("mappings");
IgniteCompute compute = ignite.compute(cacheNodes);
compute.broadcast(() -> {
    // Use the Ignite instance local to the node running this closure.
    Ignite local = Ignition.localIgnite();
    IgniteCache<K, V1> mappings = local.cache("mappings");
    IgniteCache<K, V2> entities = local.cache("entities");
    // Iterate over local primary entries.
    entities.localEntries(CachePeekMode.PRIMARY).forEach(entry -> {
        V1 mappingVal = mappings.get(entry.getKey());
        V2 entityVal = entry.getValue();
        // enrich(...) is a placeholder for the actual enrichment logic.
        V2 newEntityVal = enrich(entityVal, mappingVal);
        // It would be better to create a batch, and then call putAll(...)
        // Using simple put call for simplicity.
        entities.put(entry.getKey(), newEntityVal);
    });
});
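The batching mentioned in the comments above could be sketched roughly as follows. This is only an illustration, not part of the Ignite API: BatchingWriter is a hypothetical helper name, and the sink is assumed to be something like entities::putAll. It is written against plain java.util types so it can be tried without a cluster.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper: collects updates and hands them to a sink in fixed-size batches. */
final class BatchingWriter<K, V> implements AutoCloseable {
    private final int batchSize;
    // Receives each full batch; with Ignite this is assumed to be entities::putAll.
    private final Consumer<Map<K, V>> sink;
    private Map<K, V> batch = new HashMap<>();

    BatchingWriter(int batchSize, Consumer<Map<K, V>> sink) {
        if (batchSize <= 0)
            throw new IllegalArgumentException("batchSize must be positive");
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one update and flushes once the batch is full. */
    void add(K key, V value) {
        batch.put(key, value);
        if (batch.size() >= batchSize)
            flush();
    }

    /** Sends any buffered updates to the sink and starts a fresh batch. */
    void flush() {
        if (batch.isEmpty())
            return;
        Map<K, V> full = batch;
        batch = new HashMap<>();
        sink.accept(full);
    }

    /** Flushes the final, possibly partial, batch. */
    @Override
    public void close() {
        flush();
    }
}
```

Inside the broadcast closure, you would then wrap the loop in try (BatchingWriter<K, V2> writer = new BatchingWriter<>(1000, entities::putAll)) { ... } and replace the put call with writer.add(entry.getKey(), newEntityVal). The batch size of 1000 is an arbitrary starting point and would need tuning for roughly kilobyte-sized values.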