objective-c json core-data automatic-ref-counting nsdata

Excessive memory footprint when storing JSON (as NSData) on Core Data objects under ARC


I am loading filesets of several hundred MB into Core Data. Instead of using relational joins, I've found that I get much better performance by creating NSDictionary / NSArray objects and serializing them onto the Core Data records. The problem so far is that my memory footprint goes through the roof due to mallocs attributed to the line marked as the "BAD LINE" below:

NSMutableDictionary *stackEntryDictionary = [NSMutableDictionary dictionary];
NSDictionary *stackDictionary = [NSDictionary dictionaryWithObjectsAndKeys:
                                 [NSString stringWithUTF8String:parts[6].c_str()], @"relationship",
                                 [NSString stringWithUTF8String:parts[8].c_str()], @"sequenceId",
                                 [NSString stringWithUTF8String:parts[9].c_str()], @"sequence",
                                 [NSString stringWithUTF8String:parts[7].c_str()], @"block",
                                 [NSNumber numberWithInteger:row], @"entryId",
                                 nil];
if ([relationship isEqualToString:@"consensus"] || [relationship isEqualToString:@"model"]) {
    [stackEntryDictionary setObject:stackDictionary forKey:relationship];
    row = 1;
}
else {
    [stackEntryDictionary setObject:stackDictionary forKey:[NSNumber numberWithInt:row].stringValue];
    ++row;
}


stackEntryDatumMO = [NSEntityDescription insertNewObjectForEntityForName:@"StackEntryDatum" inManagedObjectContext:document.managedObjectContext];
stackEntryDatumMO.sampleId = sampleMO.sampleId;
stackEntryDatumMO.name = sampleMO.name;

stackEntryDatumMO.tagId = [NSNumber numberWithInteger:locusId];

// THIS IS THE "BAD LINE" that issues a lot of NSString malloc's 

stackEntryDatumMO.stackData = [NSJSONSerialization dataWithJSONObject:stackEntryDictionary options:0 error:&error];

This memory is released once I am fully out of the loop (about 70K of these inserts per file, for 30 files, and the average size of the insert dictionary is roughly 20). Until then, however, the application holds about 10 GB of memory, which kills performance and is unnecessary.

So, I have four questions: 1 - How would you suggest that I embed objects in Core Data (or would you advise against it)? 2 - Is there a better JSON serialization library that would offer a lower memory footprint? 3 - Should I abandon ARC, and if so, how? 4 - Any other suggestions? Using a separate import "Application" would be a possible interim solution, but not something that I'd be willing to ship on the App Store long-term.

I should also mention that when ARC cleans up after the load, it will kill the application as it tries to release objects that have already been released (can post error later if need be).   This does not happen on smaller files.  


Solution

  • Try wrapping the NSJSONSerialization call in an @autoreleasepool block. The call creates a ton of autoreleased objects that aren't needed after it returns.

    As for what you're doing: storing the serialized objects as NSData will incur the same memory footprint in Core Data as storing them in their own file. You'll pay a bit more per object for the bookkeeping required by NSManagedObject, but the net cost is approximately the same.

    You can tune the footprint by forcing the Core Data objects to deallocate more frequently. The most abrupt way to do this is to save and then deallocate the managed object context you are using. Apply @autoreleasepool as necessary.

    It would also help if you could post a sample from the Allocations instrument in Instruments taken while in the loop, or a snapshot of the heap.
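
The two suggestions above can be sketched roughly as follows, assuming ARC. This is only an illustration, not the asker's import code: `EncodeEntry`, `EncodeEntries`, and `FlushContext` are hypothetical helpers, and the dictionary contents are placeholders. `FlushContext` also swaps in `-reset` in place of literally deallocating the context. A reset invalidates every managed object obtained from that context earlier (for example `sampleMO` in the question), so those would have to be refetched, and it may not be appropriate for a document-owned context.

```objectivec
#import <Foundation/Foundation.h>
#import <CoreData/CoreData.h>

// Hypothetical helper: builds one placeholder entry and returns its JSON
// encoding. Temporaries created here are autoreleased into whichever pool
// is current when the helper is called.
static NSData *EncodeEntry(NSInteger row, NSError **error) {
    NSDictionary *entry = @{
        @"relationship": @"primary",
        @"sequenceId":   [NSString stringWithFormat:@"seq-%ld", (long)row],
        @"entryId":      @(row)
    };
    NSDictionary *wrapper = @{ @(row).stringValue : entry };
    return [NSJSONSerialization dataWithJSONObject:wrapper options:0 error:error];
}

// Hypothetical helper: saves pending changes, then resets the context so
// the objects it has registered can be released. Assumption: the caller
// holds no managed objects it still needs, since they are invalid after
// the reset.
static BOOL FlushContext(NSManagedObjectContext *context, NSError **error) {
    // Save first; a reset would otherwise discard unsaved inserts.
    if (context.hasChanges && ![context save:error]) {
        return NO;
    }
    [context reset];
    return YES;
}

// Encodes `count` entries and returns the total number of JSON bytes,
// or 0 if any entry fails to serialize.
static NSUInteger EncodeEntries(NSInteger count) {
    NSUInteger totalBytes = 0;
    for (NSInteger row = 1; row <= count; row++) {
        // Inner pool: drained at the end of every iteration, so the
        // temporaries do not accumulate until the whole loop finishes.
        @autoreleasepool {
            NSError *error = nil;
            NSData *json = EncodeEntry(row, &error);
            if (json == nil) {
                NSLog(@"Serialization failed: %@", error);
                return 0;
            }
            // In a real import, the data would be assigned to the managed
            // object here (stackEntryDatumMO.stackData = json), with
            // FlushContext called every few thousand rows.
            totalBytes += json.length;
        }
    }
    return totalBytes;
}
```

Calling `EncodeEntries(70000)` with and without the inner `@autoreleasepool` while watching the Allocations instrument should show whether the autoreleased temporaries account for the growth. How often to call `FlushContext` is a tuning decision, since each save has its own cost.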