Search code examples
swiftalgorithmcore-datacomplexity-theorylookup

Improve performance of Array ordering by IDs


I have a function:

func findExistingModels(ids: [String]) throws -> [Model?] {
    let models: [Model] = try getModelsFromCache(ids: ids)
    // ???
}

As you can see, it calls getModelsFromCache which roughly looks like this:

class Model: NSManagedObject {
    let id: String
}

func getModelsFromCache(ids: [String]) throw -> [Model] {
    let fetchRequest = NSFetchRequest<Model>(entityName: "Model")
    fetchRequest.fetchLimit = ids.count
    fetchRequest.predicate =  NSPredicate(format: "%K IN %@", argumentArray: ["id", identifiers])

    // some CoreData code, like performAndWait
    // Just ignore all the CoreData concurrency aspects

    return models
}

As you can see, getModelsFromCache returns and array of Model which will contain anywhere from 0 to ids.count amount of Model objects, depending on their existence in the storage.

What I need findExistingModels to do is:

func findExistingModels(ids: [String]) throws -> [Model?] {
    let models: [Model] = try getModelsFromCache(ids: ids)

    return ids.map { id in
        if let model = models.first(where: { model in
            model.id == id
        }) {
            return model
        }
        return nil
    }
}

Put in words, I need to map the ids to their corresponding Models with the goal of preserving the sorting based on the incoming ids, and also, I need to place nils corresponding to the ids for which the Models haven't been found.

This algorithm that I wrote, with first search, it works, but it's not sound from asymptotic complexity perspective. At worst, this function would produce the complexity of O(pow(ids.count, 2))

My question is:

  • what is the more effective algorithm for this kind of compartmentalizing?
  • does CoreData NSPredicate with IN preserve the order of appearing keys after FetchRequest - i.e. can I assume that for ids ["1", "2", "3"] the array that will be returned from NSManagedObjectContext.fetch will be in order of 1, 2, 3 in terms of the ids of the returned objects - and maybe we can build an effective algorithm around that?

I haven't tried yet, but I have a couple of ideas/suggestions:

Drain Pool

  • Turn the models array into a slice
  • look up via firstIndex
  • retrieve the item via index
  • delete the item from models so that next time, for next id, we won't search over it

Example:

var models = ArraySlice(getCache())
var results = [Model?](repeating: nil, count: ids.count)

ids.enumerated().forEach { index, id in
    if let modelIndex = models.firstIndex(where: { model in
        model.id == id
    }) {
        let model = models[modelIndex]
        models.remove(at: modelIndex)
        results[index] = model
    }
}

return results

Do you think this is a good idea? Can't quite calculate the complexity in my head...

Set

  • make Model hashable, where hash is calculated as id, literally just the value of it
  • turn array into set
  • look up the value by the hash...? somehow?

Maybe this will be faster?


Solution

  • Use a dictionary, as per @Alexander comments.