I've had a problem for a while. I hacked together a solution, but I'm revisiting it in the hope of finding a proper one, so far without success. In Core Data I've got a bunch of RSS articles. The user can subscribe to individual channels within a single feed. The problem is that some feed providers post the exact same article in multiple channels of the same feed, so the user ends up with two or more copies of the same article. I want to keep all copies in case the user unsubscribes from a channel that contains one copy but stays subscribed to another channel that contains a duplicate, but I only want to show a single article in the list of available articles.
To identify duplicates, I create a hash of the article's text content and store it as a property on the Article entity in Core Data (text_hash). My original thinking was that I would be able to craft a fetch request that returned only articles with a unique value for this property, something like an SQL query. That turned out not to be the case (I was just learning Core Data at the time).
So, as a workaround, I fetch all the articles, create an empty set, and enumerate the fetch results, checking whether each article's hash is in the set. If it is, I ignore the article; if it isn't, I add the hash to the set and the article id to an array. When I'm finished, I create a predicate based on the article ids and do another fetch.
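Roughly, the workaround described above looks like the sketch below. It is an illustration, not the app's actual code: the function name FetchUniqueArticles is hypothetical, ctx is assumed to be an NSManagedObjectContext, the entity is assumed to be named "Article", and "article id" is assumed to mean the NSManagedObjectID.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper sketching the two-pass workaround.
// Assumes an "Article" entity with a "text_hash" attribute.
static NSArray *FetchUniqueArticles(NSManagedObjectContext *ctx, NSError **error) {
    // Pass 1: fetch every article.
    NSFetchRequest *all = [NSFetchRequest fetchRequestWithEntityName:@"Article"];
    NSArray *articles = [ctx executeFetchRequest:all error:error];
    if (articles == nil) {
        return nil; // fetch failed; *error describes why
    }

    NSMutableSet *seenHashes = [NSMutableSet set];
    NSMutableArray *uniqueIDs = [NSMutableArray array];
    for (NSManagedObject *article in articles) {
        id hash = [article valueForKey:@"text_hash"];
        if (hash != nil) {
            if ([seenHashes containsObject:hash]) {
                continue; // duplicate content, skip it
            }
            [seenHashes addObject:hash];
        }
        // First article seen with this hash (articles without a hash are kept).
        [uniqueIDs addObject:article.objectID];
    }

    // Pass 2: fetch only the articles whose ids were collected.
    NSFetchRequest *unique = [NSFetchRequest fetchRequestWithEntityName:@"Article"];
    unique.predicate = [NSPredicate predicateWithFormat:@"self IN %@", uniqueIDs];
    return [ctx executeFetchRequest:unique error:error];
}
```

Which copy of a duplicated article survives depends on the order of the first fetch, since no sort descriptors are set here.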
This seems really wasteful and clumsy. Not only am I fetching twice and enumerating the results, but because the final predicate is based on the individual article ids, I have to re-run the whole thing every time I add a new article.
It works for now, but I am going to work on a new version of this app and would like to make this better if at all possible. Any help is appreciated, thanks!
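You could use propertiesToGroupBy with a dictionary result type, which returns one dictionary per distinct text_hash rather than Article objects, like so: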
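NSFetchRequest *fr = [NSFetchRequest fetchRequestWithEntityName:@"Article"];
fr.resultType = NSDictionaryResultType;   // grouping requires dictionary results
fr.propertiesToGroupBy = @[@"text_hash"];
fr.propertiesToFetch = @[@"text_hash"];   // may only contain grouped properties or aggregates
NSError *error = nil;
// One NSDictionary per distinct text_hash; nil on failure, with details in error.
NSArray *groups = [ctx executeFetchRequest:fr error:&error];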