core-data ios5 relationship fault prefetch

Core Data Faulting and Fetching of To-Many Relationships

I have several "theory" questions on Core Data behavior related to what happens with a to-many relationship and when to rely on walking the relation from a parent entity and when a fresh fetch request should be built. They're all very much related.

Background

Assume a parent entity RPBook, which has a to-many relationship to RPChapter. A book has many chapters. The inverse is set in the Core Data model too. A basic form of manual ordering is involved, so the RPChapter entity has a chapterIndex property. I am not using iOS 5's new ordered relationships here (they are not relevant to these questions either).

To get to the chapters in a book, one would use the chapters relationship accessor:

RPBook *myBook; // Assume this is already set to an existing RPBook
NSSet *myChapters = myBook.chapters;

Usage / Setup

In an iPhone app, we'd start off with a table view showing a list of RPBook instances. The corresponding chapters wouldn't be pre-fetched as part of the fetch spec for the fetched results controller backing the table view, since those chapters are not yet needed.

When I select one of those RPBook instances, I'm taken to a new page, and my view controller holds a reference to this RPBook instance, which does NOT have its chapters prefetched.

Question 1: Invoking `filteredSetUsingPredicate:` on `chapters` relation right away

If I want to filter via the chapters relation using filteredSetUsingPredicate: directly, will that even work reliably, given that I didn't pre-fetch all related RPChapter instances of the current RPBook I'm looking at? Put another way, does filteredSetUsingPredicate: trigger faulting behind the scenes of all objects in that relation in order to do its thing, or will it misleadingly only give me results based on which of the chapters already happened to be in memory (if any)?

If I don't have an egregious number of associated chapters to a book, should I instead style this by invoking allObjects first? i.e.

[[self.chapters allObjects] filteredArrayUsingPredicate:predicate]

instead of just:

[self.chapters filteredSetUsingPredicate:predicate]

Question 2: Batch retrieval of all a book's chapters

In the case I have an RPBook instance, but no pre-fetched RPChapter instances related to it, how do I force all of a book's chapters to be fetched in one shot using the chapters relation? Does [myBook.chapters allObjects] do that or can I still get faults back from that call?

I want Core Data to fulfill all the faults in a batch instead of tripping faults for the odd RPChapter asked for if that will affect the behavior of using filteredSetUsingPredicate: on the chapters relation, as per Question 1 above.

Must I resort to an explicit fetch request to do this? Should I refetch the RPBook I already have, but this time, request in the fetch request, that all associated chapters also be fetched using setRelationshipKeyPathsForPrefetching:?

This last option just seems wasteful to me, because I already have a scoped relation that conceptually represents the subset of all RPChapter instances I'm interested in. As much as possible, I'd like to just walk the object graph.

Question 3: NSFetchedResultsController of RPChapter instances on the same Thread

Setup: In this case I have an RPBook instance, but no pre-fetched RPChapter instances related to it (but they do exist in the store). In the same view controller, I also have an NSFetchedResultsController (FRC) of RPChapter instances scoped to the very same book. So that's the same thread and the same managed object context.

Is an RPChapter instance from the FRC going to be the same object in memory as an RPChapter instance counterpart I retrieve from myBook.chapters, that shares the same ObjectID? Put another way, does the runtime ever fulfill managed object requests for the same ObjectID from the same MOC in the same Thread, using different physical objects in memory?

Question 4: Design pattern of installing an `NSFetchedResultsController` inside a Managed Object to serve queries for a relation

I'm trying to decide whether I should service queries about a relationship whose contents change frequently (chapters in a book, in my example) by using the built-in chapters relation provided by my custom RPBook managed object subclass, or whether it's ever okay, from a design/architecture perspective, to install an FRC of RPChapter instances on the RPBook managed object class to service queries about that book's chapters efficiently.

It's clearly cleaner if I could just rely on the chapters accessor in myBook instance, but it seems an FRC here might actually be more performant and efficient in situations where a large volume of destination entities in the to-many relation exist.

Is this overkill or is this a fair use of an FRC for querying an RPBook about its chapters in different ways? Somehow it feels like I'm missing the opportunity to walk the object graph simply. I'd like to be able to trust that the chapters relation is always up to date when I load my RPBook instance.

Solution

Question 1

Yes, it will work. When you call [book chapters], the relationship fault fires and the set is populated automatically. When you filter on those objects, each one faults in as the predicate reads its properties.

However, you should be using an NSFetchedResultsController here, with a predicate like @"book == %@", instead of grabbing the set and filtering it yourself.
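As a rough sketch of that suggestion, the helper below builds an FRC scoped to a single book. The function name RPMakeChaptersController is hypothetical, and the code assumes ARC, an entity named "RPChapter", and an inverse to-one relationship from RPChapter to RPBook named "book" (the question says the inverse exists but does not name it).

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: builds an FRC over the chapters of one book.
// Assumes ARC, an entity named "RPChapter", and an inverse to-one
// relationship from RPChapter back to RPBook named "book".
static NSFetchedResultsController *RPMakeChaptersController(NSManagedObject *book,
                                                            NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"RPChapter"];
    request.predicate = [NSPredicate predicateWithFormat:@"book == %@", book];

    // An FRC requires at least one sort descriptor.
    NSSortDescriptor *byIndex = [NSSortDescriptor sortDescriptorWithKey:@"chapterIndex"
                                                              ascending:YES];
    request.sortDescriptors = [NSArray arrayWithObject:byIndex];

    NSFetchedResultsController *controller =
        [[NSFetchedResultsController alloc] initWithFetchRequest:request
                                            managedObjectContext:book.managedObjectContext
                                              sectionNameKeyPath:nil
                                                       cacheName:nil];

    // performFetch: returns NO and fills in *error on failure.
    return [controller performFetch:error] ? controller : nil;
}
```

Further filtering can be folded into the predicate (for example with an NSCompoundPredicate) so the store does the work instead of an in-memory set filter.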

Question 2

The best way to force the NSManagedObjectContext to load all of the chapters is to execute an NSFetchRequest configured to return fully realized objects instead of faults (setReturnsObjectsAsFaults:NO). This pre-loads them all in one go. However, unless you have a TON of chapters, you are not going to see much of a saving here.

Why?

Because when you request those chapters, even in a faulted state, Core Data is going to load the data into a cache (excluding a few edge cases like binary data) so that when you "fault" an object it is just pointers moving around in memory and no additional disk hit.

You would need, probably, thousands of chapters to see any benefit out of a pre-fetch.
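If a one-shot load is still wanted, the fetch request described above might look roughly like this. The function name RPFetchChaptersForBook is hypothetical, and the entity name "RPChapter" and inverse relationship name "book" are assumptions about the model.

```objc
#import <CoreData/CoreData.h>

// Hypothetical helper: fetches every chapter of a book as fully
// realized objects in a single round trip to the store.
// Assumes an entity named "RPChapter" with a to-one relationship "book".
static NSArray *RPFetchChaptersForBook(NSManagedObject *book, NSError **error) {
    NSFetchRequest *request = [NSFetchRequest fetchRequestWithEntityName:@"RPChapter"];
    request.predicate = [NSPredicate predicateWithFormat:@"book == %@", book];

    // Ask for populated objects rather than faults.
    request.returnsObjectsAsFaults = NO;

    // Returns nil and fills in *error if the fetch fails.
    return [book.managedObjectContext executeFetchRequest:request error:error];
}
```

Because the fetch runs in the book's own context, the returned chapters are the same instances that book.chapters vends, so walking the relationship afterwards should not need another trip to the store for them.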

Question 3

Yes. They will be the same object. Within a single NSManagedObjectContext, there is only ever one NSManagedObject instance for a given objectID, no matter how you retrieve it. Maintaining that uniqueness is part of the job of the NSManagedObjectContext.
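One way to convince yourself of this is a pointer comparison, sketched below. The function name RPChapterIsUniqued is hypothetical; it assumes the to-many relationship is named "chapters" and that both objects come from the same context.

```objc
#import <CoreData/CoreData.h>

// Hypothetical check: finds the chapter in book.chapters that has the
// same objectID as a chapter obtained from the FRC and reports whether
// the two are literally the same instance in memory.
static BOOL RPChapterIsUniqued(NSManagedObject *book, NSManagedObject *chapterFromFRC) {
    // Enumerating the set fires the relationship fault if needed.
    for (NSManagedObject *chapter in [book valueForKey:@"chapters"]) {
        if ([chapter.objectID isEqual:chapterFromFRC.objectID]) {
            return chapter == chapterFromFRC; // pointer equality
        }
    }
    return NO; // no chapter in the relationship has that objectID
}
```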

Question 4

You want to use an NSFetchedResultsController; that is its job. Managing that stuff manually is wasteful and almost guaranteed to be less efficient than Core Data's implementation.

However, the relationship will always be up to date unless you are tweaking it from another thread. So if you do not expect updates then you could just use an array. I wouldn't.