I'm working on an iOS app where I download large data sets from a remote web service. I have a local Core Data database. I don't want to save duplicates of each object, so I need to detect whether an object with a specific value already exists locally.
The way I do it now works, but I'm noticing some performance issues. For each object I currently make a fetch request to see if it exists (~500+ times). My fetch looks like this:
+ (Entity *)entityWithIdentifier:(NSString *)identifier {
    NSFetchRequest *fetchRequest = [NSFetchRequest fetchRequestWithEntityName:@"Entity"];
    NSPredicate *predicate = [NSPredicate predicateWithFormat:@"identifier == %@", identifier];
    fetchRequest.predicate = predicate;
    [fetchRequest setFetchLimit:1];
    NSError *error = nil;
    NSArray *array = [[[AppDelegate sharedDelegate] backgroundContext] executeFetchRequest:fetchRequest error:&error];
    return array.firstObject;
}
As mentioned, this method gets called often - for each remote object that I want to parse (create/insert or update).
It currently blocks the main thread for ~2000 ms, and it will get worse as the data set grows. What's the most efficient way to do what I want:
1. Check if object with value exists locally.
2. If it exists, return it; if not, return nil and create a new local object.
My recommendation is to replace your 500+ fetches with a single fetch. First, extract all downloaded IDs into a single array using KVC.
NSArray *downloadedIDs = [downloadedRecords valueForKeyPath:@"identifier"];
Then do a single fetch with this predicate:
[NSPredicate predicateWithFormat:@"identifier in %@", downloadedIDs];
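As a rough sketch of how that single fetch could be wired up, assuming the `Entity` entity and string `identifier` attribute from the question: the helper below (its name `ExistingEntitiesByIdentifier` is made up for illustration) runs one fetch and indexes the results by identifier, so each downloaded record can be matched with a dictionary lookup instead of its own fetch. It is meant to be called on the context's own queue, for example inside `performBlock:`.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: fetches every stored "Entity" whose identifier is in
// `identifiers` with ONE fetch request and returns the objects keyed by identifier.
// Returns nil (and fills in *error) if the fetch fails.
static NSDictionary<NSString *, NSManagedObject *> *
ExistingEntitiesByIdentifier(NSArray<NSString *> *identifiers,
                             NSManagedObjectContext *context,
                             NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Entity"];
    request.predicate = [NSPredicate predicateWithFormat:@"identifier IN %@", identifiers];

    NSArray *fetched = [context executeFetchRequest:request error:error];
    if (fetched == nil) {
        return nil;
    }

    // Index the fetched objects so later lookups do not touch the store again.
    NSMutableDictionary<NSString *, NSManagedObject *> *lookup =
        [NSMutableDictionary dictionaryWithCapacity:fetched.count];
    for (NSManagedObject *object in fetched) {
        NSString *identifier = [object valueForKey:@"identifier"];
        if (identifier != nil) {
            lookup[identifier] = object;
        }
    }
    return lookup;
}
```

With the dictionary in hand, `lookup[identifier]` is either the existing object to update or nil, in which case a new object is inserted for that record.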
Now you can update the fetched objects and iterate through the remaining IDs to create new objects. Getting the remaining IDs is also simple once you extract the existing IDs from the fetch results, using KVC as before:
NSArray *existingIDs = [fetchedObjects valueForKeyPath:@"identifier"];
NSArray *remainingIDs = [downloadedIDs filteredArrayUsingPredicate:
    [NSPredicate predicateWithFormat:@"NOT (SELF IN %@)", existingIDs]];
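If the ID lists get large, filtering against an array is likely to scan `existingIDs` once per downloaded ID. One possible alternative, sketched below with a hypothetical helper name and assuming the identifiers are strings, does the same subtraction with sets so each membership check is a hash lookup. Unlike the predicate filter, it also drops duplicate IDs from the download.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the downloaded IDs that are not stored locally yet,
// in their original order and with duplicates removed.
static NSArray<NSString *> *RemainingIdentifiers(NSArray<NSString *> *downloadedIDs,
                                                 NSArray<NSString *> *existingIDs) {
    NSMutableOrderedSet<NSString *> *remaining =
        [NSMutableOrderedSet orderedSetWithArray:downloadedIDs];
    // Remove everything that already exists locally.
    [remaining minusSet:[NSSet setWithArray:existingIDs]];
    return remaining.array;
}
```

For a few hundred records either approach should be fine; the set version mainly matters as the data set grows.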
There are some possible further optimisations, e.g. by just fetching the identifier properties, but in my experience the above should solve your performance problems.
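One way the identifier-only fetch might look is sketched below. The helper name `ExistingIdentifiers` is made up, and the entity and attribute names are taken from the question. It uses `NSDictionaryResultType` with `propertiesToFetch`, so Core Data returns plain dictionaries instead of full managed objects. Two caveats: dictionary results reflect what is saved in the persistent store and ignore unsaved changes in the context, and this only tells you which IDs exist, so it fits best when existing objects do not need updating.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: returns the subset of `identifiers` that is already stored,
// fetching only the "identifier" attribute. Returns nil (and fills in *error) on failure.
static NSSet<NSString *> *ExistingIdentifiers(NSArray<NSString *> *identifiers,
                                              NSManagedObjectContext *context,
                                              NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"Entity"];
    request.predicate = [NSPredicate predicateWithFormat:@"identifier IN %@", identifiers];
    request.resultType = NSDictionaryResultType;   // rows are NSDictionary, not managed objects
    request.propertiesToFetch = @[@"identifier"];  // load just this one attribute

    NSArray *rows = [context executeFetchRequest:request error:error];
    if (rows == nil) {
        return nil;
    }
    // Each row looks like @{ @"identifier": @"..." }; collect the values into a set.
    return [NSSet setWithArray:[rows valueForKey:@"identifier"]];
}
```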